Static Analysis of the DeepSeek Android App
I conducted a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.
I've written about DeepSeek previously here.
Additional security and privacy concerns about DeepSeek have been raised.
See also this analysis by NowSecure of the iPhone version of DeepSeek.
The findings detailed in this report are based solely on static analysis. This means that while the code exists within the app, there is no conclusive evidence that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, security, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.
Key Findings
Suspicious Data Handling & Exfiltration
- Hardcoded URLs direct data to external servers, raising concerns about user activity tracking, for example to ByteDance "volce.com" endpoints. NowSecure identified these in the iPhone app yesterday as well.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulations are here.
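As an illustration of how hardcoded endpoints can be surfaced during static analysis, the following is a minimal sketch that scans decompiled source text for URLs and groups them by host. It is not the tooling used for this report. The function name `hardcoded_hosts`, the input format (a dict of file name to source text), and the example hostnames are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Heuristic: a URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def hardcoded_hosts(sources):
    """Map each referenced host to the sorted list of files that mention it.

    `sources` is a hypothetical dict of {file name: decompiled source text}.
    """
    hosts = defaultdict(set)
    for name, text in sources.items():
        for url in URL_PATTERN.findall(text):
            try:
                host = urlparse(url).hostname
            except ValueError:
                # Malformed URL (for example a broken IPv6 literal); skip it.
                continue
            if host:
                hosts[host].add(name)
    return {host: sorted(files) for host, files in hosts.items()}


if __name__ == "__main__":
    # Made-up input; these hostnames are placeholders, not findings.
    sample = {
        "A.java": 'String u = "https://api.example.com/v1/log";',
        "B.java": 'x = "http://tracker.example.net/e"; y = "https://api.example.com/v2";',
    }
    for host, files in sorted(hardcoded_hosts(sample).items()):
        print(host, files)
```

A list like this only shows that an endpoint is referenced in the code. Whether data is actually sent there has to be confirmed with dynamic analysis.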
Device Fingerprinting & Tracking
A significant portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.
- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest possible anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting regimes by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. For example, if the app cannot determine the device through standard Android SIM lookup (because permission was not granted), it attempts manufacturer-specific extensions to access the same information.
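The kinds of references listed above can be flagged with a simple keyword pass over decompiled sources. The sketch below is an assumption about how one might do this, not a description of how this analysis was performed. The needles are standard Android API names (`TelephonyManager.getImei()`, `getSubscriberId()`, `Settings.Secure.ANDROID_ID`, and so on) plus a couple of common root-detection strings. The category labels and the function name `fingerprint_indicators` are hypothetical.

```python
# Category -> substrings whose presence hints at that kind of collection.
# Matching is case-insensitive and purely textual, so false positives are expected.
INDICATORS = {
    "IMEI": ("getDeviceId", "getImei"),
    "IMSI": ("getSubscriberId",),
    "SIM serial": ("getSimSerialNumber",),
    "Android ID": ("ANDROID_ID",),
    "carrier": ("getNetworkOperatorName", "getSimOperator"),
    "root detection": ("magisk", "/system/xbin/su"),
}


def fingerprint_indicators(text):
    """Return {category: [matched needles]} for one decompiled source text."""
    lowered = text.lower()
    found = {}
    for category, needles in INDICATORS.items():
        hits = [needle for needle in needles if needle.lower() in lowered]
        if hits:
            found[category] = hits
    return found


if __name__ == "__main__":
    # Made-up snippet standing in for decompiled app code.
    snippet = (
        "String a = tm.getImei();\n"
        "String b = Settings.Secure.getString(cr, Settings.Secure.ANDROID_ID);\n"
        'boolean rooted = new File("/sbin/.magisk").exists();\n'
    )
    for category, hits in fingerprint_indicators(snippet).items():
        print(category, hits)
```

A hit only means the identifier API is referenced somewhere in the code. It says nothing about whether the call is reachable or whether the permission it needs has been granted.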
Potential Malware-Like Behavior
While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors line up with known spyware and malware patterns:
- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app implements calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves in turn make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
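The runtime loading described in the last three points can be checked for on both sides of the JNI boundary. The following is a minimal sketch under stated assumptions: it looks for well-known class-loading references in decompiled Java text, and for dynamic-linker symbol names among the printable strings of a native library, much like the `strings` utility does. The function names and sample inputs are hypothetical, and the exact-match rule for native symbols is a simplification that relies on symbol names being NUL-terminated in the binary.

```python
import re

# JVM-side loader references (real Android/Java API names).
JAVA_LOADERS = (
    "DexClassLoader",
    "PathClassLoader",
    "InMemoryDexClassLoader",
    "System.loadLibrary",
    "System.load(",
)

# Native-side dynamic loading symbols exported by the Android linker.
NATIVE_LOADERS = ("dlopen", "dlsym", "android_dlopen_ext")


def java_loader_references(text):
    """Return the loader names that appear in one decompiled source text."""
    return [name for name in JAVA_LOADERS if name in text]


def printable_strings(data, min_len=4):
    """Extract runs of printable ASCII of at least `min_len` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.decode("ascii") for match in pattern.findall(data)]


def native_loader_symbols(data):
    """Return the loader symbols found as whole strings in a .so file's bytes."""
    strings = set(printable_strings(data))
    return sorted(symbol for symbol in NATIVE_LOADERS if symbol in strings)


if __name__ == "__main__":
    # Made-up inputs standing in for decompiled code and a native library.
    java_text = 'new DexClassLoader(path, null, null, parent); System.loadLibrary("foo");'
    so_bytes = b"\x7fELF\x00\x00dlopen\x00dlsym\x00libfoo.so\x00\x01\x02"
    print(java_loader_references(java_text))
    print(native_loader_symbols(so_bytes))
```

Finding `dlopen` in a library is common and not suspicious on its own. The point of a pass like this is to identify which libraries deserve a closer look in a disassembler.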
Remarks
While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no valid reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.
The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.
The use of runtime code loading together with the bundling of native code suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.
Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is often justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.
Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.
Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.