Perspective | China’s First Case Involving Big Data on Public Opinion: Judicial Insights and Industry Guidance on the Compliance of Data Scraping


Published:

2025-12-05

On November 25, 2025, the Fujian Provincial Higher People's Court issued its final judgment in the unfair competition dispute between Wangzhi Tianyuan Technology Group Co., Ltd. (“Wangzhi Company”) and Tencent, upholding the first-instance ruling that ordered Wangzhi Company to pay 2.28 million yuan in compensation. The judgment, dubbed “China’s First Case on Public Opinion Big Data,” is the first to clearly define the judicial boundaries for protecting rights in big data collections. It provides important guidance for assessing the compliance of data scraping and is expected to have a profound impact on the standardized development of the entire big data industry.



 

I. From Single Data to Collective Value: The Judicial Paradigm Shift in Big Data Rights Protection


 

The central dispute in this case is whether Tencent enjoys legally protected rights in its collection of news data. The legal nature of data rights has long been ambiguous in judicial practice, and the protection of massive data sets in particular has lacked clear rules. In its judgment, the Fujian Provincial Higher People's Court constructed a three-part line of reasoning, “investment, value, protection,” to resolve this issue.


 

First is “substantial investment.” Tencent does not simply collect fragmented data; through long-term investment in copyright acquisition, technological R&D, operations, and promotion, it has built a structured data collection containing a massive volume of news. This investment is not only financial but also integrates resources across content curation, algorithm optimization, and user interaction. Next is “independent value.” Thanks to its scale, the data set has generated competitive value beyond that of any single news item: it is not a mere accumulation of information but a core resource for enterprises to analyze user preferences, optimize content distribution, and enhance market competitiveness. Finally, there is “legal protection.” Based on Article 2 of the Anti-Unfair Competition Law, which sets out the principles of voluntariness, equality, fairness, and integrity as well as business ethics, the court held that competitive resources accumulated through lawful operations should be afforded legal protection. This reasoning breaks with the traditional perception that “data carries no rights,” extending the scope of protection from individual data items to data sets with independent value and offering a new approach to establishing corporate rights in data assets.


 

II. Technological Neutrality Does Not Equate to Legality of Conduct: The Triple Criteria for Determining the Illegality of Data Scraping


 

Wangzhi Company argued that its data scraping was lawful because it complied with the robots protocol, but the court did not accept this defense. Instead, the court established criteria for determining the illegality of data scraping along three dimensions: protocol compliance, behavioral rationality, and business ethics.


 

First, protocol compliance. The crawler protocol is essentially a technical access rule: it authorizes only “technical scraping” and does not cover “commercial use.” After using its “Zhan Ying” public opinion analysis software to scrape data from Tencent News, Wangzhi Company used that data in its paid public opinion analysis products, clearly exceeding the scope the protocol authorizes. Second, behavioral rationality. The court introduced the “principle of minimal necessity,” which requires data scraping to be limited to the minimum scope necessary for the intended function. Wangzhi Company failed to implement the necessary technical restrictions, such as limits on scraping frequency and data volume thresholds, so its scraping expanded without control and aggravated the harm to Tencent’s data rights. Third, business ethics. The court pointed out that Wangzhi Company’s conduct was in essence “free-riding”: profiting directly from data sets built through others’ investment, which both harms Tencent’s competitive interests and violates the principle of honesty and good faith in Article 2 of the Anti-Unfair Competition Law. The ruling makes clear that technological neutrality cannot serve as a “shield” for unlawful conduct; the legality of data scraping must be assessed comprehensively across protocol, conduct, and ethics.


 

III. From “Vague Estimation” to “Precise Calculation”: The Refinement of Infringement Compensation Practices


 

Another key highlight of this case is the method used to calculate damages. The first-instance court broke with the traditional single-dimensional calculation and adopted a composite method that takes into account both the rights holder's losses and the infringer's profits. The second-instance court upheld this approach. Specifically:

Direct loss: The base compensation amount was determined by the duration of the infringement (Wangzhi Company collected data continuously over a long period) and the scope of data usage (covering Tencent News’ core content).

Infringement profits: The illegal profits Wangzhi Company obtained from the infringement were calculated from objective data such as the number of users, pricing standards, and recharge records of the “Zhan Ying” product.

Punitive factors: Taking into account Wangzhi Company’s intentional misconduct (knowingly continuing to use the data beyond the authorized scope) and the repeated nature of its infringing acts (it had previously received warnings for similar behavior), the final compensation amount was set at 2.28 million yuan.


 

This calculation method—balancing loss and gain, and combining subjective malice with objective harm—both compensates the rights holder for their actual losses and effectively deters the infringer, providing a replicable judicial template for damages calculations in similar cases.


 

IV. From Passive Response to Proactive Governance: The Compliance Transformation Path for the Big Data Industry


 

The significance of this case extends well beyond the ruling itself: it also sets a compliance direction for the entire big data industry. Drawing on the key points of the judgment and on industry practice, enterprises can build a compliance system for data collection along the following dimensions:


 

1. Establish a three-tiered prevention and control mechanism comprising “protocol review – technical restrictions – usage monitoring.”

Protocol review: Develop an intelligent crawler protocol parsing system that automatically identifies the scope of authorization granted by data source providers (e.g., whether commercial use is permitted, limits on crawling frequency, etc.).

Technical restrictions: Set strict technical parameters such as crawling frequency and data volume to prevent out-of-scope crawling caused by a crawler running out of control.

Usage monitoring: Establish a whitelist system for data usage scenarios, clearly stipulating that data can only be used for non-competitive purposes (such as internal research) and prohibiting its direct use in commercial products.
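The three tiers above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a description of any system discussed in the judgment: the names `ALLOWED_PURPOSES`, `CrawlBudget`, and `may_fetch`, the thresholds, and the `example.com` URLs are all hypothetical. The robots.txt parsing uses Python's standard `urllib.robotparser`. Note that robots.txt has no field for expressing whether commercial use is permitted, so the purpose check has to be maintained separately (for example, from the source's terms of service).

```python
import time
from dataclasses import dataclass
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist of permitted usage scenarios (assumption for illustration).
ALLOWED_PURPOSES = {"internal_research"}


def parse_robots(robots_txt: str) -> RobotFileParser:
    """Tier 1 (protocol review): parse robots.txt text that was already fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


@dataclass
class CrawlBudget:
    """Tier 2 (technical restrictions): minimum delay between requests and a page cap."""
    min_interval_s: float
    max_pages: int
    pages_fetched: int = 0
    last_fetch: float = float("-inf")

    def allow(self, now: float) -> bool:
        if self.pages_fetched >= self.max_pages:
            return False  # data volume threshold reached
        if now - self.last_fetch < self.min_interval_s:
            return False  # request would exceed the frequency limit
        self.pages_fetched += 1
        self.last_fetch = now
        return True


def may_fetch(parser, budget, user_agent, url, purpose, now=None) -> bool:
    """Return True only if all three tiers permit the request."""
    # Tier 3 (usage monitoring): the declared purpose must be whitelisted.
    if purpose not in ALLOWED_PURPOSES:
        return False
    # Tier 1: the robots rules must allow this user agent to fetch the URL.
    if not parser.can_fetch(user_agent, url):
        return False
    # Tier 2: the request must fit within the frequency and volume budget.
    return budget.allow(time.monotonic() if now is None else now)
```

A crawler would call `may_fetch` before every request and skip the request when it returns `False`. Checking the purpose first means a non-whitelisted use never consumes crawl budget.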


 

2. Improve compliance auditing and traceability mechanisms.

Conduct special audits on data compliance every quarter, with a particular focus on verifying the legality of data sources and the compliance of data usage scopes.

Introduce a third-party certification body to conduct compliance assessments and enhance the credibility of audits.

Establish a comprehensive data scraping log system (including time, scope, purpose, etc.) to ensure full-process traceability and preserve evidence for potential legal disputes.
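As a rough illustration of the scraping log described above, the following Python sketch appends one JSON record per request to an append-only file, capturing time, scope, and purpose. The record fields, the function names (`make_entry`, `append_entry`, `read_entries`), and the file layout are assumptions made for this example; a production system would likely add tamper protection and retention controls.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ScrapeLogEntry:
    timestamp: str      # time of the request, ISO 8601 in UTC
    url: str            # scope: what was fetched
    purpose: str        # declared purpose of the collection
    bytes_fetched: int  # volume of data retrieved


def make_entry(url, purpose, bytes_fetched, when=None):
    """Build a log entry; `when` defaults to the current UTC time."""
    when = when or datetime.now(timezone.utc)
    return ScrapeLogEntry(when.isoformat(), url, purpose, bytes_fetched)


def append_entry(path, entry):
    """Append one entry as a single JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


def read_entries(path):
    """Load all entries back, e.g. for a quarterly audit."""
    with open(path, encoding="utf-8") as fh:
        return [ScrapeLogEntry(**json.loads(line)) for line in fh if line.strip()]
```

One JSON object per line keeps the log easy to append to and to filter by time range or purpose during an audit.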


 

3. Promote alignment of industry standards with international practices.

Participate in the development of industry standards such as the “Data Scraping Guidelines,” to standardize the technical, commercial, and legal boundaries of data scraping.

Join industry compliance alliances and share compliance experience (such as sensitive data encryption and classified data management).

Track developments in judicial precedents and, in conjunction with legal provisions such as the Data Security Law and the Personal Information Protection Law, dynamically adjust compliance strategies.


 

V. Conclusion: Striking a Balance Between Innovation and Order


 

The judgment in the “Zhan Ying” case marks a crucial step forward for China’s judiciary in the field of big data protection. By following a three-stage logic of identifying rights and interests, characterizing the conduct, and assigning liability, the court both safeguarded the legitimate rights and interests of data originators and set reasonable boundaries for data circulation. As the Data Security Law and the Personal Information Protection Law are further implemented, big data protection will evolve into a three-dimensional governance framework of “legal regulation + industry self-regulation + technological safeguards.”


 

For enterprises, data is a core competitive advantage, but compliance is the lifeline of development. Only by striking a balance between technological innovation and legal compliance and by clearly defining the boundaries between data circulation and the protection of rights and interests can we truly promote the healthy and sustainable development of the big data industry.

News source:

https://mp.weixin.qq.com/s/XK8G4lgX0wg3ElFTH4bQ6Q


 
