Static Analysis of The DeepSeek Android App
I conducted a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.
I have written about DeepSeek previously here.
Additional security and privacy concerns about DeepSeek have been raised.
See also this analysis by NowSecure of the iOS version of DeepSeek.
The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no definitive evidence that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.
Key Findings
Suspicious Data Handling & Exfiltration
- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity monitoring. NowSecure reported these in the iOS app yesterday as well.
- Custom encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More information about WebView manipulation is here.
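The first finding above, hardcoded URLs, is the kind of thing an analyst can triage with a simple pass over decompiled sources. The following is a minimal sketch of that step, not the tooling used for this report. The function name `extract_hosts`, the regular expression, and the sample snippet (which uses placeholder example.com hosts) are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches http(s) URL literals; stops at quotes and whitespace.
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9._~:/?#@!$&()*+,;=%-]+")


def extract_hosts(source_text: str) -> Counter:
    """Count the hostname of every http(s) URL literal found in the text."""
    hosts = Counter()
    for url in URL_PATTERN.findall(source_text):
        host = urlparse(url).hostname
        if host:
            hosts[host] += 1
    return hosts


# Hypothetical decompiled snippet, invented for illustration only.
SAMPLE = '''
String a = "https://api.example.com/v1/chat";
String b = "https://telemetry.example.net/log?x=1";
String c = "https://api.example.com/v1/user";
'''

if __name__ == "__main__":
    for host, count in extract_hosts(SAMPLE).most_common():
        print(host, count)
```

Grouping by hostname rather than by full URL makes it easier to spot third-party endpoints that appear many times across the codebase. A hit only shows that the string is present, not that the app ever contacts the host.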
Device Fingerprinting & Tracking
A significant portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.
- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting routines by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. For example, if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it attempts manufacturer-specific extensions to access the same information.
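To make the fingerprinting findings above more concrete, here is a minimal sketch of how an analyst might flag identifier collection and root detection in decompiled source text. The indicator list is my own assumption, built from well-known Android API names (`TelephonyManager.getImei`, `PackageManager.getInstalledPackages`, and so on) and the public Magisk package name. It is not the set of strings found in the DeepSeek app, and the sample snippet is invented.

```python
# Indicator strings grouped by category. This list is an assumption chosen
# for illustration; it is not taken from the analyzed app.
INDICATORS = {
    "device_identifiers": [
        "getDeviceId",         # TelephonyManager: IMEI/MEID on older APIs
        "getImei",             # TelephonyManager
        "getSubscriberId",     # TelephonyManager: IMSI
        "getSimSerialNumber",  # TelephonyManager: ICCID
        "android_id",          # value of Settings.Secure.ANDROID_ID
    ],
    "root_detection": [
        "com.topjohnwu.magisk",  # Magisk manager package name
        "/system/xbin/su",
        "/sbin/su",
    ],
    "package_enumeration": [
        "getInstalledPackages",
        "getInstalledApplications",
    ],
}


def find_indicators(source_text: str) -> dict:
    """Return {category: [matched strings]} for indicators present in the text."""
    hits = {}
    for category, needles in INDICATORS.items():
        found = [needle for needle in needles if needle in source_text]
        if found:
            hits[category] = found
    return hits


# Hypothetical decompiled snippet, invented for illustration only.
SAMPLE = '''
String imei = telephonyManager.getImei();
boolean rooted = new File("/system/xbin/su").exists();
'''

if __name__ == "__main__":
    for category, found in find_indicators(SAMPLE).items():
        print(category, found)
```

Plain substring matching like this is easily defeated by string obfuscation or reflection, so an empty result does not mean the behavior is absent.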
Potential Malware-Like Behavior
While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:
- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app makes calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
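The runtime loading findings above can be triaged in two steps: listing the native libraries bundled in the APK, and searching decompiled code for loader calls. The sketch below illustrates both, assuming only that an APK is a ZIP archive with native libraries under `lib/`. The marker list, the function names, and the in-memory demo archive are hypothetical and exist only for illustration.

```python
import io
import zipfile

# Strings that commonly indicate runtime code loading. The selection is an
# assumption for illustration, not the list of calls found in the app.
LOADER_MARKERS = [
    "DexClassLoader",
    "InMemoryDexClassLoader",
    "System.loadLibrary",
    "System.load(",
    "dlopen",
]


def list_native_libs(apk_file) -> list:
    """Return sorted paths of .so entries under lib/ in an APK (a ZIP archive)."""
    with zipfile.ZipFile(apk_file) as apk:
        return sorted(
            name for name in apk.namelist()
            if name.startswith("lib/") and name.endswith(".so")
        )


def find_loader_markers(source_text: str) -> list:
    """Return the loader markers that appear in the given source text."""
    return [marker for marker in LOADER_MARKERS if marker in source_text]


def build_demo_apk() -> io.BytesIO:
    """Build a tiny in-memory archive that stands in for a real APK."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("classes.dex", b"")
        archive.writestr("lib/arm64-v8a/libdemo.so", b"")
        archive.writestr("assets/readme.txt", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(list_native_libs(build_demo_apk()))
    print(find_loader_markers('System.loadLibrary("demo");'))
```

With a real file, `list_native_libs("app.apk")` would accept the path directly. Note that libraries fetched or unpacked at runtime will not appear in this listing, which is part of why the loading behavior described above is hard to audit statically.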
Remarks
While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no valid reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.
The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.
The use of runtime code loading together with the bundling of native code suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.
Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is often justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.
Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.
Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.