Tech Brief

Obfuscation: Protecting the Source

How to protect your source code from reverse engineering.

One of the consequences of developing applications with managed code such as .NET or Java is that source code extraction (reverse engineering) becomes a very simple task. Today more than 500,000 developers have installed reverse engineering utilities. Reverse engineering is a common practice for support, education and debugging of .NET applications.

However, unmanaged access to application source code can also pose material risks, including Intellectual Property (IP) theft, application vulnerability exposure and software piracy. For those organizations where these risks must be managed, there is really only one option: obfuscation.

What Is Obfuscation?
At its core, obfuscation is defined as a collection of transformations that are applied to compiled applications (DLLs in the case of .NET) that make reverse engineering materially more difficult for people and machines but do not alter the behavior of the obfuscated application. However, the current climate of heightened emphasis on application security, compliance and development best practices has rendered this definition inadequate. A complete definition has three dimensions: obfuscation as a technology, a development process and an IT control.

From a technology standpoint, obfuscation transformations fall into a number of categories:

  • Renaming. Altering the names of methods, variables, etc. to make source code more difficult to understand. Strong renaming algorithms use overloading to reuse names, forcing every line to be analyzed.
  • Control flow obfuscation. Logic and flow are re-expressed so that decompilation into valid C# (or any other high-level language) becomes extremely difficult, if not impossible. Sophisticated approaches provide different levels to strike the right balance between obfuscation and performance.
  • String encryption. Strings such as login prompts, SQL queries, etc. are encrypted, and calls to a decryption function are injected into the instruction stream so that each string is decrypted just before it is needed.
  • Other techniques. Numerous other methods such as metadata stripping and application watermarking raise the bar for reverse engineering above what is required to reverse engineer native code (such as C or COBOL).
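
To make the renaming idea concrete, here is a minimal sketch of how overloading lets an obfuscator reuse the same short name for unrelated methods. It is an illustration only, not how any particular product works: the method list, the `rename_with_overloads` helper and the `a`, `b`, ... naming scheme are all assumptions made for the example.

```python
from itertools import count, product
import string


def short_names():
    """Yield candidate identifiers: a, b, ..., z, aa, ab, ..."""
    for length in count(1):
        for combo in product(string.ascii_lowercase, repeat=length):
            yield "".join(combo)


def rename_with_overloads(methods):
    """Map each (name, parameter types) pair to a short name.

    Two methods may share a new name as long as their parameter lists
    differ, because that is a legal overload. A method only moves on to
    the next name when its parameter list is already taken.
    """
    used = {}     # new name -> set of parameter lists already assigned to it
    mapping = {}  # (original name, parameter list) -> new name
    for name, params in methods:
        for candidate in short_names():
            taken = used.setdefault(candidate, set())
            if params not in taken:
                taken.add(params)
                mapping[(name, params)] = candidate
                break
    return mapping


# Hypothetical methods of a single class (names invented for illustration).
methods = [
    ("GetCustomer", ("int",)),
    ("SaveOrder", ("Order",)),
    ("ComputeTax", ("decimal", "string")),
    ("LoadInvoice", ("int",)),
]

renamed = rename_with_overloads(methods)
for (name, params), new_name in renamed.items():
    print(f"{name}({', '.join(params)}) -> {new_name}({', '.join(params)})")
```

In this sketch three of the four methods end up with the same name, so a reader of the decompiled code cannot tell them apart by name and has to inspect each call site.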

    Without a well-defined and integrated obfuscation process, the complexities and risks introduced by obfuscation may ultimately outweigh the promised benefits. Obfuscation can complicate debugging, patch generation and management, distributed development practices and the reuse of libraries, components and Web services.

    These challenges can be mitigated. Tight integration with development platforms (such as Visual Studio), the inclusion of tools and utilities that can unwind and/or reuse obfuscation transformations and the integration with operations management platforms (such as Microsoft's Operations Manager) can all address potentially costly side effects.
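
As one illustration of "unwinding" a transformation, the sketch below restores original names in an obfuscated stack trace using a rename map saved at obfuscation time. The map format, the trace layout and the `unwind_stack_trace` helper are assumptions made for the example; real tools define their own map files and typically match on full signatures.

```python
import re


def unwind_stack_trace(trace, rename_map):
    """Replace obfuscated qualified names in a stack trace with the originals.

    Names that are not in the map (for example, code that was excluded
    from renaming) are left unchanged.
    """
    frame = re.compile(r"\bat ([\w.]+)\(")

    def restore(match):
        obfuscated = match.group(1)
        return "at " + rename_map.get(obfuscated, obfuscated) + "("

    return frame.sub(restore, trace)


# Hypothetical map captured when the build was obfuscated.
rename_map = {
    "a.b.c": "Billing.InvoiceService.ComputeTotal",
    "a.d.a": "Billing.TaxTable.Lookup",
}

# Hypothetical trace reported from the field.
trace = (
    "System.NullReferenceException\n"
    "   at a.d.a(String)\n"
    "   at a.b.c(Int32)\n"
    "   at Program.Main()"
)

restored = unwind_stack_trace(trace, rename_map)
print(restored)
```

Because the map is the only link between shipped names and source names, it has to be stored with the build it belongs to; a trace from one release cannot be reliably unwound with the map from another.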

    Obfuscation Development Process
    Figure 1 illustrates a modern approach to enterprise obfuscation. The best practice is for developers to indicate where obfuscation transformations are or are not appropriate.

    Figure 1. A modern approach to enterprise obfuscation.

    Obfuscation is applied after the build and should also include additional services such as compaction (stripping of unused code to reduce size) and linking (combining multiple DLLs into one to simplify distribution). Further, all transformations should be captured (and previous transformations reused as appropriate) to support the many development scenarios outlined above.
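
Compaction can be pictured as a reachability problem: anything that cannot be reached from an application entry point is a candidate for removal. The following is a minimal sketch under that assumption; the call graph, the method names and the `compact` helper are invented for illustration and do not describe a specific tool.

```python
from collections import deque


def reachable_methods(call_graph, entry_points):
    """Return every method reachable from the entry points (breadth-first)."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def compact(call_graph, entry_points):
    """Keep only the methods that are reachable from the entry points."""
    keep = reachable_methods(call_graph, entry_points)
    return {m: callees for m, callees in call_graph.items() if m in keep}


# Hypothetical call graph: method -> methods it calls directly.
call_graph = {
    "Main": ["LoadConfig", "Run"],
    "LoadConfig": [],
    "Run": ["Render"],
    "Render": [],
    "LegacyExport": ["Render"],   # never called from Main
    "DebugDump": ["LegacyExport"],  # never called from Main
}

compacted = compact(call_graph, ["Main"])
print(sorted(compacted))
```

A static graph like this does not see calls made through reflection, which is one reason developers need a way to mark code that must be kept.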

    Beyond the Technology
    As an IT control, obfuscation is a documented process that identifies the specific risks being managed and the procedures and training programs that ensure obfuscation is applied appropriately and consistently.

    The definition of obfuscation is evolving from strictly a "preventative control" to a "detective control." Obfuscation can enable applications to self-diagnose when tampering has occurred and alert operations management or managed service platforms. It's an ideal vector to introduce application monitoring for both security and performance and can provide the following functionality:

  • Instrumentation of applications using .NET attribute values set by developers.
  • Agentless monitoring through runtime DLLs linked into the application.
  • Minimal change to the application footprint, achieved with existing compaction functionality.
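
The article does not say how tamper detection is implemented. One general approach, sketched below purely as an assumption, is to record a digest of the code at build time and compare it at startup, raising an alert when the two differ. The `check_integrity` helper, the sample bytes and the alert callback are hypothetical.

```python
import hashlib
import hmac


def digest(code_bytes):
    """Return the SHA-256 digest of the code as a hex string."""
    return hashlib.sha256(code_bytes).hexdigest()


def check_integrity(code_bytes, expected_digest, alert):
    """Return True if the code matches the digest recorded at build time.

    On a mismatch, call the alert callback (which could notify an
    operations console in a real deployment) and return False.
    """
    actual = digest(code_bytes)
    if hmac.compare_digest(actual, expected_digest):
        return True
    alert(f"tampering detected: expected {expected_digest[:12]}, got {actual[:12]}")
    return False


# Hypothetical stand-ins for the shipped binary and a modified copy.
shipped = b"original application bytes"
patched = b"original application bytes + injected code"

expected = digest(shipped)  # recorded when the build was obfuscated
alerts = []                 # stand-in for an operations management feed

ok_original = check_integrity(shipped, expected, alerts.append)
ok_patched = check_integrity(patched, expected, alerts.append)
print(ok_original, ok_patched, len(alerts))
```

A check this simple can itself be patched out by an attacker, which is why it would be combined with the other transformations described earlier rather than used alone.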

    Reverse engineering is widely used and obfuscation is the accepted best practice to mitigate risks. Auditors expect any organization developing with managed code to have a position on the risks that may be introduced. In a recent survey that included over 300 respondents, 75 percent indicated that if their senior management understood these risks, they would modify their internal controls to ensure that the use of obfuscation is appropriate and consistent across all of their development projects.

    To be clear, obfuscation may be appropriate for 1 percent or 80 percent of an organization's development projects. However, an organization that does not yet have a policy on obfuscation is exposing itself to potentially damaging IT audit findings as well as potential material damages through lost IP, increased application vulnerability or exploitation, or lost revenue through software piracy.

    About the Author

    Sebastian Holst is chief marketing officer at PreEmptive Solutions LLC. He served as a member of the W3C Advisory Committee for three years and as a board member of IDEAlliance for four years, and was a cofounder of the Compliance Consortium.
