Smart taxation: Re-optimization of the technical environment and further improvement of algorithm governance
9 months ago
Source:ThepaperCn

With smart taxation this capable, will tax evasion and false invoicing disappear?

Of course not. Driven by profit, the global contest between tax evasion and anti-evasion is a long-term one, and tax-related crime is itself becoming more intelligent. However far smart taxation advances, offenders adapt in step, and the technical environment still poses many practical difficulties. The inherent flaws and risks of artificial intelligence, in both general-purpose models and industry vertical models, remain.

1. External technical environment and risk challenges

Big data analysis is the basis for model cognition, reasoning and decision-making. As an intelligent agent, the smart tax system naturally depends on optimizing both its big data and its technical operating environment.

First, data integration is difficult. Tax-related data is a semi-public good, and in practice many constraints limit how openly it can be shared. As a specialized platform, the smart tax system finds it very hard to integrate all the relevant big data. To begin with, taxpayers' asset and income data involve trade secrets or personal privacy. Most companies are reluctant to open the API of their own platforms directly; data can only be retrieved from their own ERP and SAP platforms and then connected and called through the tax-enterprise end-side platform. As for horizontal platform data, whether basic, economic, social or judicial, most departments export it cautiously under the principle of "least privilege" because of their confidentiality and security responsibilities. Cross-border tax-related information can only be obtained through the CRS system via reciprocal cooperation and cannot be integrated by a single department. At present, only vertical declaration data, invoice data and a small amount of horizontal data can be integrated. Of course, for a specialized system, more integrated data is not necessarily better; beyond a certain point it becomes a hot potato.

Second, data cleaning and mining are difficult. Digitalization has two pain points. One is the high cost of data mining. Data that has not been cleaned and desensitized is hard to integrate and share, yet cleaning, mining and processing data require advanced algorithms and ample computing power, which is expensive. With limited fiscal resources, the public sector finds it very hard to produce high value-added data products. At present, domestic research and development capacity for large models is concentrated in the big digital companies, but the conditions are not yet mature for cleaning and mining public data on a market basis, because specialized departments face uncertainty over data security and data value. Most of the data fed to the algorithm models of the current smart tax system is native invoice data integrated vertically within the system, plus application data cleaned outside it. The system can only rely on existing computing power to extract as much value from the data as possible and achieve "precise, accurate and fine-grained" governance.

Third, it is difficult to reconcile the technology with legal principles and mechanisms. Big data technology offers reliable algorithms and objective judgments. Results produced under its consistency standard avoid subjective human interference and deliver the same answer to the same question, the same ruling for the same case, and the same liability for the same offense. Under the current accountability system, however, treating fairness as absolute uniformity of results leads people to rely on machines and give up their own judgment. Take the discretionary penalty standard in tax violation cases: the tax law sets a range of 0.5 to 5 times, to be decided by the law enforcement body according to the circumstances of the case, which is hard to quantify as a model parameter. What big data yields is merely the average penalty multiple for similar cases, or the mode N, yet for law enforcement bodies N naturally becomes the standard answer for "same case, same ruling". Since following the crowd means not being held accountable, giving up independent judgment and conforming becomes their best choice. This is clearly not what the law intended when it granted discretion.
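The statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of past penalty multiples is hypothetical, and the helper name `reference_multiples` is invented here, not part of any real tax system. It shows how an average and a mode N fall out of historical cases, which is exactly the figure that risks becoming a de facto standard answer.

```python
from statistics import mean, multimode

# Statutory discretion range cited in the text: 0.5 to 5 times.
MIN_MULTIPLE = 0.5
MAX_MULTIPLE = 5.0

# Hypothetical penalty multiples from past cases judged "similar".
past_multiples = [0.5, 1.0, 1.0, 1.0, 2.0, 3.0]


def reference_multiples(history):
    """Return the average and the most common multiple(s) within the legal range."""
    in_range = [m for m in history if MIN_MULTIPLE <= m <= MAX_MULTIPLE]
    return mean(in_range), multimode(in_range)


average, modes = reference_multiples(past_multiples)
print(f"average multiple: {average:.2f}")
print(f"mode N: {modes}")
```

Neither number reflects the circumstances of a new case; both only summarize what earlier officers decided, which is why relying on them can hollow out the discretion the law grants.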

Moreover, because AI training data is flawed and incomplete, machine conclusions can also suffer from data bias, erroneous predictions and unfair discrimination. Excessive trust in the system's accuracy and reliability, together with long-term reliance on its decisions, leads people to neglect their own professional knowledge and judgment, so that errors go uncorrected.

The integration and sharing of big data, together with consistency standards, facilitate linked, penetrating analysis and joint punishment. However, the more widely information is integrated and shared, the more likely it is that several linked pieces of negative information will implicate one another, resulting in long vertical look-back periods, multiple horizontal penalties and excessively wide implication. A legal net that reaches far back, lets nothing slip through and implicates many parties is more than the law itself can bear, and it is inconsistent with the spirit of the Third Plenary Session of the 20th Central Committee of the Communist Party of China, which called for establishing a system for sealing records of minor offenses.

Fourth, it is difficult to eliminate algorithmic discrimination. Because of information illusions inherent in the technology, big data models easily develop algorithmic discrimination against certain specific elements. Once formed, this targeting of particular regions, groups of people or items easily becomes entrenched. In tax risk prevention and control, for example, regions, sensitive products and sensitive groups with a history of frequent fake exports and tax fraud leave traces in the system; even after their offline credit has been restored, the system may still flag them as risks.

It is technically difficult to identify and remove discriminatory factors when monitoring large models, and AI algorithms are not very interpretable: the output is clear but the process is opaque. Manual intervention therefore demands of grassroots law enforcement officers the cognitive ability, the authority and the courage to make decisions and bear risk, so the shrewd choice is to rely on machine decisions and stay consistent. Yet if these once-sensitive regions, products and groups are treated unfairly because of algorithmic discrimination and the unfairness goes uncorrected for a long time, it may become a source of instability.

Similarly, in assessing tax credit ratings and setting invoice credit lines, the system may automatically exclude groups with certain characteristics because of factors such as taxpayers' historical credit, which requires manual intervention and correction.

Fifth, it is difficult to define human-machine responsibilities. Much of the system's operational monitoring of tax law enforcement takes place in human-computer interaction scenarios. In the era of large models, machines have no difficulty understanding professional language, and the traditional Turing test can no longer distinguish human actions from machine actions. The boundaries between human and machine action, and between their responsibilities, blur as the two merge. This raises new problems, because AI can also make mistakes, for example in pre-filled tax returns that taxpayers then confirm, or by leaking confidential tax information during interaction. Once legal liability arises and litigation follows, it is hard to determine clearly whether the person or the machine is responsible.

For example, the quality of the data the system pre-fills depends largely on the quality of the declaration data fed in during model training and labeling, which in turn affects taxpayers' expectations and tax compliance. Zero-distance, intimate service can instead leave those being served feeling both "fond and fearful". This is also a long-standing tension between the integrative nature of AI and the clarity that boundary governance requires. Safe, reliable and trustworthy AI should not only encourage innovation and fair competition but also take on the responsibility of safeguarding citizens' equal rights.

Sixth, the technical operating environment is difficult. The smart tax system adopts an integrated technical architecture of "localized technical products, internationalized algorithm standards". The Web server, Oracle database and Linux operating system running on domestic cloud platforms are all developed on the basis of international open source software technology. Since the open source technology available for commercial use and promotion is relatively dated, the "shell" assembled large models that developing countries build by drawing on open source technology lack originality. Training large models requires advanced algorithms, high-end chips and ample computing power, and China's high-end chips depend heavily on foreign supply. Under the U.S. "small yard, high fence" policy on AI/ML technology, model operation and technology upgrades face many "chokepoint" bottlenecks.

Internationally, the pre-training architecture of large models based on the Transformer neural network, and the emergence of the GPT series of very large language models, with their lifelike natural language interaction, multi-scenario content generation, multimodal understanding and powerful reasoning, have achieved a close integration of big data, big computing power and big algorithms. This puts very real pressure on the interaction capabilities and technical operating environment of domestic end-side models and of tax-enterprise interaction models based on CPUs and NLP text understanding. Moreover, the Transformer is a deep neural network built on an attention mechanism that can process sequential data efficiently in parallel. Its machine learning paradigm, which does not maximize a utility function, does not classify, and does not require labeling, greatly reduces data collection, labeling, training and computing costs and systematically improves operating efficiency, while posing new challenges to algorithm models that depend on labeling and factorization.

In addition, because data standards and technical environments are incompatible among tax-enterprise, tax-bank and departmental platforms, data can only be connected or exchanged by transmitting information; true data integration is not technically achievable. This problem is global. In 2023, the European Commission required that the VAT invoice systems and electronic declaration systems of EU countries be compatible with one another, but implementation is very costly and difficult.

Seventh, it is difficult to integrate the "three calculations". Integrating computing power, computing volume and algorithms is also a technical-environment goal of smart taxation. Of these, the scarcity of computing power is the bottleneck constraining the digital economy at this stage. Artificial intelligence faces an "impossible triangle": when computing power is scarce, it is difficult to integrate information from different data sources and sensors quickly, and human-computer interaction cannot deliver both instant answers and accurate answers at the same time.

China's total computing power accounts for 31% of the global total, ranking second in the world, but its distribution is unbalanced. Computing volume is concentrated in the eastern region, with its many leading enterprises and dense network nodes, while computing power is mainly located in the western region, where power resources are abundant. The distribution of the smart tax system's computing volume matches that of the digital economy. To this end, the country has implemented the "Eastern Data, Western Computing" strategy, using western computing resources to take on eastern data processing and reduce computing costs. However, because of the long transmission distance, "hot data" is hard to process this way, and integration efficiency needs to improve. On the algorithm side, the emergence of GPUs and LPUs, whose performance and energy efficiency far outstrip those of CPUs, has also put pressure on the system's algorithm upgrades. There is a long way to go to integrate the three calculations.

Eighth, it is difficult to deal with new risks. Since the service scope of the electronic tax bureau and the digital invoice platform was extended to natural persons and mobile terminals, the system's risk exposure has grown markedly, and so has the risk of supervising invoices for billions of natural persons. The original interactive software and invoice data flowed only among general taxpayers on the B-side, whereas the new system directly serves natural persons on the C-side. With more frequent human-computer interaction and more fine-grained tax services, business requirements and application scenarios become more complex. The need to outsource software development and to call external open source software will grow, dependence on the software supply chain will grow with it, and the system's external risks will increase. Meanwhile, a new generation of hackers in the AI environment can use machine learning algorithms and adaptive systems to identify and exploit the weaknesses and information of specialized system models, maliciously generate plausible-looking false information about tax refunds, and create information cocoons that weaken taxpayers' ability to recognize and judge tax-related behavior, seeking illegal gains in the process.

According to the general rules for identifying falsely issued value-added tax invoices, the flows of goods, funds and invoices in a transaction must all be audited. Only goods trade with "three-flow consistency" meets the conditions for deduction. However, the spread of digital payments makes this rule hard to enforce effectively. The electronic channel diversion rate of China's banking industry has reached 97.0%, annual online business transactions exceed 450 billion, and digital payments on mobile terminals have reached hundreds of trillions. Under the existing "three calculations" constraints, financial regulators can only supervise large account fund movements in real time. Even if banks were willing to open data interfaces to the tax authorities, having the smart tax system monitor every taxpayer's fund flows online in real time is neither possible nor necessary.
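The consistency rule for the flows of goods, funds and invoices can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Flow` record, its field names and the rounding tolerance are hypothetical and do not describe how any real tax system represents transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One leg of a transaction: who supplied, who received, and how much."""
    seller: str
    buyer: str
    amount: float


def three_flows_consistent(goods: Flow, funds: Flow, invoice: Flow,
                           tolerance: float = 0.01) -> bool:
    """True when all three flows name the same parties and matching amounts."""
    same_parties = (
        goods.seller == funds.seller == invoice.seller
        and goods.buyer == funds.buyer == invoice.buyer
    )
    amounts = [goods.amount, funds.amount, invoice.amount]
    same_amount = max(amounts) - min(amounts) <= tolerance
    return same_parties and same_amount


# Goods, payment and invoice all run between the same two parties.
consistent = three_flows_consistent(
    Flow("A", "B", 1000.0), Flow("A", "B", 1000.0), Flow("A", "B", 1000.0)
)
# The payment was made by a third party C, so the funds flow does not match.
inconsistent = three_flows_consistent(
    Flow("A", "B", 1000.0), Flow("A", "C", 1000.0), Flow("A", "B", 1000.0)
)
print(consistent, inconsistent)
```

The check itself is trivial; the difficulty the text points to is obtaining the funds-flow record for every transaction at all.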

Therefore, with respect to the long-standing high risk of false invoicing, the smart tax system has effectively solved the falsification, forgery and alteration of the invoice carrier, but it still cannot eliminate the false issuance of genuine invoices or the swapping of invoice content. For example, criminals set up shell companies to carry out multi-link circular invoicing, invoice swapping and brute-force false invoicing, and false invoicing involving e-commerce and digital payments still cannot be detected and contained in time.
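One way to picture multi-link circular invoicing is as a cycle in a directed graph whose edges run from invoice issuer to recipient. The following is a minimal sketch of that idea using a standard depth-first search; the function name `find_invoice_cycle` and the sample company labels are hypothetical, and a detected cycle is at most a signal for review, since legitimate trade can also form loops.

```python
def find_invoice_cycle(edges):
    """Return one cycle of companies as a list, or None if there is none.

    edges: iterable of (issuer, recipient) pairs, one per invoice.
    """
    graph = {}
    for issuer, recipient in edges:
        graph.setdefault(issuer, []).append(recipient)
        graph.setdefault(recipient, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:  # nxt is already on the path: a cycle
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in graph:
        if color[start] == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


invoices = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(find_invoice_cycle(invoices))  # ['A', 'B', 'C', 'A']
```

The recursive search is fine for a small illustration; a graph the size of a national invoice network would need an iterative or distributed approach.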

2. Moving gradually toward a new stage of algorithm governance by optimizing the technical environment

From dedicated tax administrators to the separation of collection, administration and inspection; from abacuses and calculators to computer networks; from platform governance and process-driven work to data-driven work and algorithm governance: as tax administration reform and tax governance models have evolved, the road to smart taxation has been long and arduous. With the fourth technological revolution arriving, the Third Plenary Session of the 20th Central Committee of the Communist Party of China called for continuing to deepen the reform of tax collection and administration, and there is still a long way to go in optimizing the technical environment of the smart tax system.

First, seize the new opportunities of the new round of fiscal and tax system reform, and improve and upgrade the smart tax system as the tax system is optimized. As a new productive force of the digital era, the digital economy is characterized by virtualization, integration and value that exists in reflected rather than physical form. These new qualities fit poorly with the substance-based taxation principles of the current industrial and commercial tax system, which presuppose the real world, clear boundaries and physical existence. To this end, the Third Plenary Session of the 20th Central Committee of the Communist Party of China proposed studying a tax system compatible with new business forms. One goal of the new round of fiscal and tax reform is to explore a new indirect tax system whose structure matches the digital economy's profit model, to align the consolidation of tax items and tax rates with the convergence of the elements of taxable objects, and to create a rule system and tax environment better suited to new quality productive forces. As the technical platform on which the tax system operates, the smart tax system naturally needs to be optimized and upgraded around the new tax system elements, with new parameters set accordingly.

Second, seize the new opportunities of the digital technology revolution, absorb and apply the most advanced international digital technology on the basis of independent innovation, and make the smart tax system smarter. At present, China still lags the most advanced international level in chips, software and large model technology, and the integration of the three calculations is still far off. As a new tax governance platform built through integrated innovation on domestic cloud platforms, the smart tax system must also follow the global trend toward digitalized tax collection and administration, promote innovation, do what it can, overcome difficulties and deliver results. Only through self-renewal in continuous digital upgrading and intelligent transformation can the modernization of tax governance capacity and governance levels be demonstrated.

Third, combine goal orientation with problem orientation and gradually optimize the system's technical environment with an open mind. As an industry vertical agent, the smart tax system currently faces technical-environment problems most of which are common to general-purpose AI models and must be solved by relying on higher-level or external forces. But the problems specific to specialized models, namely insufficient and low-quality data integration and a lack of data cleaning and mining capacity and funding, can only be faced proactively: tackle them one by one, strengthen horizontal collaboration with an open mind, and win taxpayers' compliant tax-related data through sincerity and trustworthiness, improving both the quality and the quantity of system data.

Fourth, focus on improving algorithm governance capability and modernize the tax governance system and governance capacity. Technically, the capability and level of digital economy governance pass through three stages: platform governance, data governance and algorithm governance. Platform governance is the primary stage. This tangible governance model centers on boundary governance: people may act within the boundaries, and crossing them is dealt with strictly, with fines in minor cases and shutdowns in serious ones. As an emergency measure it is simple and effective, but it easily falls into the cycle of "tighten control and things go dead, loosen control and things turn chaotic".

The smart tax system has brought tax governance into the data governance stage, using the conclusions of big data analysis to drive business supervision processes and to prevent problems before they occur. However, limited by data integration and by digital technology itself, tax data governance is still at the exploratory stage. The new goal is to advance to the algorithm governance stage, shift tax governance toward intangible governance from the perspective of algorithms, and guide taxpayers' compliance and self-discipline through algorithms that are rational, advanced and economical. This requires not only continuous optimization of the smart tax system's technical environment, but also an optimized talent structure in the tax system and governance capacity that keeps pace with the times.

(The author, Yan Caiming, is a researcher at the Institute of Public Policy Governance of Shanghai University of Finance and Economics and holds a doctorate in economics.)