The issues with code obfuscation
Code obfuscation was the earliest approach applied to Java code protection, and it remains the most direct one.
Code obfuscation typically involves the following four techniques:
- Renaming packages, classes, and variables
- Changing control structures, such as control flow flattening and inserting opaque predicates
- String obfuscation or encryption
- Inserting useless (dead) code
Code obfuscation can significantly reduce the readability of decompiled code and make static analysis harder, but however it is applied, it cannot alter the program's execution logic.
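To make the four techniques concrete, here is a minimal hand-written sketch of a small method next to an obfuscated equivalent. It is only an illustration: the class name `ObfuscationSketch`, the method names, the XOR key `0x2A`, and the encoded byte arrays are all assumptions made for this example, and real obfuscators generate far more elaborate output. What it shows is that both versions return the same result for every input.

```java
import java.nio.charset.StandardCharsets;

public class ObfuscationSketch {

    // Original, readable version.
    static String greet(int visits) {
        if (visits > 3) {
            return "Welcome back";
        }
        return "Hello";
    }

    // "Encrypted" string constants: each byte is the ASCII character XOR 0x2A.
    static final byte[] K1 = {0x7D, 0x4F, 0x46, 0x49, 0x45, 0x47, 0x4F, 0x0A, 0x48, 0x4B, 0x49, 0x41}; // "Welcome back"
    static final byte[] K2 = {0x62, 0x4F, 0x46, 0x46, 0x45};                                           // "Hello"

    // String decryption helper with a meaningless name.
    static String x(byte[] e) {
        byte[] p = new byte[e.length];
        for (int i = 0; i < e.length; i++) {
            p[i] = (byte) (e[i] ^ 0x2A);
        }
        return new String(p, StandardCharsets.US_ASCII);
    }

    // Obfuscated equivalent of greet(): renamed, flattened, with an opaque predicate and dead code.
    static String a(int b) {
        int s = 0;          // dispatcher state for the flattened control flow
        String r = null;
        while (s != 3) {
            switch (s) {
                case 0:
                    // Opaque predicate: b*b + b = b*(b+1) is always even, so state 1 is always chosen.
                    s = (((b * b + b) & 1) == 0) ? 1 : 2;
                    break;
                case 1:
                    // The real logic, using decrypted strings.
                    r = (b > 3) ? x(K1) : x(K2);
                    s = 3;
                    break;
                default:
                    // Dead code: this state is never reached.
                    r = x(K2) + b;
                    s = 3;
                    break;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        for (int v = 0; v <= 5; v++) {
            System.out.println(v + ": " + greet(v) + " / " + a(v));
        }
    }
}
```

Running `main` prints the same pair of strings on every line, which is the point of the paragraph above: the obfuscated method is harder to read, but its behaviour is unchanged.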
JVM bytecode is an intermediate code with clear, explicit semantics, which makes it highly readable. Even when an obfuscated class file cannot be restored to readable Java source code, it can still be analyzed at the bytecode level. Because Java bytecode carries so much semantic information, this process is actually relatively easy.
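A small example may help show why bytecode stays readable after renaming. In the sketch below, `BytecodeDemo`, `area`, and the renamed method `a` are hypothetical names chosen for illustration. The instruction listing in the comment is what `javap -c` is expected to show for such a method; the exact layout of the output can differ between JDK versions.

```java
public class BytecodeDemo {

    // Readable source.
    static int area(int width, int height) {
        return width * height;
    }

    // The same method after identifier renaming.
    static int a(int a, int b) {
        return a * b;
    }

    // For both methods, the disassembled body is expected to be the same
    // four instructions:
    //
    //   iload_0    // push the first int argument
    //   iload_1    // push the second int argument
    //   imul       // multiply the two values on top of the stack
    //   ireturn    // return the int result
    //
    // Renaming removes the hints carried by the identifiers, but the
    // instructions still state plainly what is computed.

    public static void main(String[] args) {
        System.out.println(area(3, 4));
        System.out.println(a(3, 4));
    }
}
```

Compiling this class and running `javap -c BytecodeDemo` is one way to compare the two method bodies.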
We have developed a JVM bytecode execution engine in Java and Kotlin. With this project, users can dynamically debug Java programs at the bytecode level in IntelliJ IDEA. For more information, please refer to the article below.
https://protector4j.com/articles/jvm-bytecode-engine-written-with-java-and-kotlin/
We also used this engine to attempt to crack a well-known piece of obfuscated code. The specific process is described in the following article:
http://protector4j.com/articles/deobfuscate-with-vlx-vmengine/
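To give a rough idea of why executing code instruction by instruction exposes its logic, here is a toy stack machine with tracing. It is not the engine described in the articles above and it does not execute real JVM bytecode; the class `TracingVm`, its five opcodes, and the sample program are all invented for this sketch. It only shows the principle: an interpreter that records the stack after every instruction reveals what a program computes, whatever the identifiers in the original source looked like.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TracingVm {
    // Hypothetical opcodes for a toy stack machine (not real JVM opcodes).
    static final int PUSH = 0, LOAD = 1, ADD = 2, MUL = 3, RET = 4;
    static final String[] NAMES = {"PUSH", "LOAD", "ADD", "MUL", "RET"};

    // One entry per executed instruction.
    final List<String> trace = new ArrayList<>();

    int run(int[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH: stack.push(code[pc++]); break;          // push an immediate operand
                case LOAD: stack.push(locals[code[pc++]]); break;  // push a local variable
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case RET:
                    trace.add("RET -> " + stack.peek());
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
            // Record the instruction name and the stack contents after it ran.
            trace.add(NAMES[op] + " stack=" + stack);
        }
    }

    public static void main(String[] args) {
        // Program for (x + 2) * y with x = locals[0] and y = locals[1].
        int[] code = {LOAD, 0, PUSH, 2, ADD, LOAD, 1, MUL, RET};
        TracingVm vm = new TracingVm();
        int result = vm.run(code, new int[] {3, 5});
        vm.trace.forEach(System.out::println);
        System.out.println("result = " + result);
    }
}
```

With the locals `3` and `5`, the program returns 25, and the trace lists every intermediate value along the way. A real bytecode-level debugger gives an analyst the same kind of view of an obfuscated program.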
Conclusion
As the analysis above shows, JVM bytecode carries so much semantic information that it is very easy to read and analyze, and a program's running logic can be readily recovered through dynamic debugging. Since writing a dynamic debugging tool is not a very complex task, obfuscation is not a reliable protection solution.