Beak
Version 1.2

The small C and C++ Obfuscator.
Beak is a small tool that slims down your source code and, by default, generates obfuscated C and C++ source code. It was originally developed to reduce the binary size of software running on embedded systems. All user-defined symbols can be replaced with very short tokens, so the compiled binary is smaller than one built from the original source. Beak itself is very small: the obfuscator was written from scratch in ANSI C (C89) and is cross-platform, independent of operating system and processor.

General features

Result

Supported Languages

Beak was designed as a C and C++ obfuscator, and it also supports programming languages suitable for embedded systems, including:
Ada, Lisp, Lua, Forth, Tcl, Basic, Erlang, Ruby, Rust, Python, JavaScript, Java, C#.

Beak can also handle the following common programming languages. For these it can obfuscate source code files, but in most situations it will not reduce binary size:
Asm, Asp, AWK, CMake, COBOL, Cuda, D, DosBatch, Eiffel, Fortran, F#, Go, Html, Matlab, Objective-C, OCaml, PHP, Pascal, Perl, Perl6, PostScript, Prolog, R, Rexx, Rst, SQL, Scheme, Shell, Slang, SystemVerilog, TypeScript, VHDL, Vera, Verilog.

Convert Source Code

  1. Drag your project folder into Beak from Finder, or open your source code folder from the menu "Open..." or "Open Recent".
  2. Click the "Target Folder" button to select a target directory; the output files will be placed there. E.g.:
    Target directory: /Users/username/Desktop
    Output directory: /Users/username/Desktop/projectname-lt (lt means lite)
  3. Select the programming language of your source code from the toolbar.
  4. (Optional) Click the menu "Append Symbols" and open a plain text file containing symbol definitions; these symbols will be added to the set to be replaced.
  5. (Optional) Click the menu "Excluded Symbols" and open a plain text file containing symbol definitions; these symbols will be excluded from replacement.
  6. Click the "Start" button on the toolbar to start the conversion.
  7. When the process has completed, all replaced symbols are listed in tokens.txt under the target directory.

External Symbol Files:
In Step 4 and Step 5, the file is read line by line, with one symbol per line. Lines starting with "#" are ignored (comment lines).
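
The symbol file format described above can be illustrated with a minimal sketch in C. This is not Beak's own parser: the function name parse_symbols, the fixed buffer sizes, and the decisions to skip blank lines and trim trailing whitespace are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOL_LEN 128  /* assumed limit, not taken from Beak */

/*
 * Parse symbol-file text held in memory: one symbol per line, and lines
 * starting with '#' are comments. Blank lines are skipped and trailing
 * whitespace is trimmed (both are assumptions of this sketch).
 * Returns the number of symbols stored in out.
 */
int parse_symbols(const char *text, char out[][MAX_SYMBOL_LEN], int max_out)
{
    int count = 0;
    const char *p = text;

    assert(text != NULL && out != NULL);
    while (*p != '\0' && count < max_out) {
        const char *end = strchr(p, '\n');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        const char *next = end ? end + 1 : p + len;

        /* Trim trailing CR, spaces and tabs. */
        while (len > 0 && (p[len - 1] == '\r' || p[len - 1] == ' ' ||
                           p[len - 1] == '\t'))
            len--;

        /* Keep non-empty, non-comment lines that fit in the buffer. */
        if (len > 0 && p[0] != '#' && len < MAX_SYMBOL_LEN) {
            memcpy(out[count], p, len);
            out[count][len] = '\0';
            count++;
        }
        p = next;
    }
    return count;
}
```

A caller would read the whole file into a buffer, pass it to parse_symbols, and then use the returned symbols as the appended or excluded set.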

Checking source code: Sometimes the generated C or C++ code cannot be compiled directly after the steps above. For example, symbols defined in a Makefile must be replaced with the short tokens listed in tokens.txt, and some C and C++ macros may not be found by Beak. In such cases, when the compiler reports an "undeclared identifier", replace the affected macro names manually with the corresponding short tokens from tokens.txt.

Add a new file type for symbol parser

Open "/Applications/Beak.app/Contents/Resources/FileExtensions.plist" and carefully add the file extension to it. For example, to let Beak parse all .mli files as OCaml source code:

  <key>F# - OCaml</key>
  <array>
  <string>ml</string>
  <string>mli</string>
  </array>

By default, Beak parses only the file types that FileExtensions.plist defines, by file extension, for the selected programming language. You can add more file types for a language in this way.

C and C++ Standards

C: C89, C90, C99, C11.
C++: C++98, C++11, C++14, C++17, C++20.

Inside the C and C++ Obfuscator

Step 1: collect all user-defined symbols in the source code directory.
Step 2: generate shorter tokens for the user-defined symbols.
Step 3: replace the user-defined symbols with the generated shorter tokens in the source code files.
Step 4: report all replaced tokens.
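
As an illustration of Step 3 only, the following C sketch replaces identifiers in a source string using a symbol-to-token table that has already been built. It is a simplified assumption of how such a pass could look, not Beak's implementation: the names struct mapping, lookup_token and replace_symbols are hypothetical, string and character literals are copied unchanged, and comments and preprocessor directives are not treated specially.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a user-defined symbol and its short token. */
struct mapping {
    const char *symbol;
    const char *token;
};

/* Return the token for the identifier id[0..len), or NULL if unmapped. */
const char *lookup_token(const struct mapping *map, size_t n,
                         const char *id, size_t len)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strlen(map[i].symbol) == len &&
            strncmp(map[i].symbol, id, len) == 0)
            return map[i].token;
    }
    return NULL;
}

/* Copy src to dst, replacing mapped identifiers.
 * Returns 0 on success, -1 if dst is too small. */
int replace_symbols(const char *src, char *dst, size_t dst_size,
                    const struct mapping *map, size_t n)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);
    while (*src != '\0') {
        const char *piece = src;  /* text to emit for this lexeme */
        size_t len = 1;

        if (isalpha((unsigned char)*src) || *src == '_') {
            const char *tok;
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
            tok = lookup_token(map, n, src, len);
            src += len;
            if (tok != NULL) {
                piece = tok;
                len = strlen(tok);
            }
        } else if (isdigit((unsigned char)*src)) {
            /* Numbers such as 10L or 0x1F: keep suffixes attached. */
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
            src += len;
        } else if (*src == '"' || *src == '\'') {
            /* String or character literal: copy it verbatim. */
            char quote = *src;
            while (src[len] != '\0' && src[len] != quote) {
                if (src[len] == '\\' && src[len + 1] != '\0')
                    len++;
                len++;
            }
            if (src[len] == quote)
                len++;
            src += len;
        } else {
            src++;
        }

        if (o + len >= dst_size)
            return -1;
        memcpy(dst + o, piece, len);
        o += len;
    }
    dst[o] = '\0';
    return 0;
}
```

With a table mapping counter to a and increment to b, the input `increment(&counter);` becomes `b(&a);`. A real pass would also need to leave comments, include paths and reserved tokens untouched.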

Built-in token database

The obfuscator seems very simple at first glance, but it is actually complex inside. When scanning source code files, Beak must determine whether each parsed token is a reserved token that has to be kept, and it skips such tokens by querying a built-in token database. Building that database was very time-consuming; its content is described below.

Content of the built-in token database:
The built-in token database of Beak consists of several groups of tokens generated from different sources.
1: C and C++ keywords from the language standards, e.g. auto, break, case, const, default, do, while, else ...
2: C and C++ preprocessor keywords, such as:
__cplusplus, DEBUG, NDEBUG, RELEASE, __MACH__, __x86_64, __PIC__, __SSE2__, __weak__attribute__, __GNUC__, __VERSION__, __INT_MAX__, _M_IX86, __INT16_MAX__, __APPLE__, __clang__, __OBJC__, __FILE__, __LINE__, __FUNCTION__, __DATE__, __TIME__, ...
3: Standard library symbols, including stdlib, glib, libcxx, openmp, libunwind, STL, Boost etc.:
INT_MAX, INT_MIN, int32_t, stderr, atan, ceilf, atol, feof, fflush, fgetc, gets, ftell, memset, rand, qsort, strtok, vsprintf, tmpnam, wcscpy, time, tm, ...
4: Compiler tokens.
5: System tokens used by the operating system.
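
To show what querying such a database might look like, here is a minimal C sketch that stores a tiny, illustrative subset of reserved tokens in a sorted array and looks names up with the standard bsearch function. The array contents, the names reserved_tokens and is_reserved, and the choice of a sorted array are assumptions for the example; the source does not describe how Beak stores its database.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative subset only. The array must stay sorted in strcmp order
 * (uppercase letters, then '_', then lowercase) for bsearch to work. */
static const char *const reserved_tokens[] = {
    "INT_MAX", "NDEBUG", "__FILE__", "__LINE__", "__cplusplus",
    "auto", "break", "case", "const", "memset", "qsort", "stderr", "while"
};

/* bsearch comparator: key is a C string, elem points to an array slot. */
static int compare_names(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return non-zero if name must be kept as-is (never replaced). */
int is_reserved(const char *name)
{
    assert(name != NULL);
    return bsearch(name, reserved_tokens,
                   sizeof reserved_tokens / sizeof reserved_tokens[0],
                   sizeof reserved_tokens[0], compare_names) != NULL;
}
```

A scanner would call is_reserved on every identifier it parses and only treat the identifier as a user-defined symbol when the lookup fails.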

Short tokens generation algorithm

Beak uses a special algorithm to generate very short tokens that replace user-defined symbols. The algorithm was inspired by URL-shortening algorithms, but does not adopt any of them directly.
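
Beak's actual algorithm is not published here, so the following C sketch only shows the general URL-shortener idea it refers to: turning a running counter into the shortest available identifier. The function name make_token and the alphabet are assumptions. The first character is drawn from 52 letters so that the result is a valid C identifier; later characters may also be digits.

```c
#include <assert.h>
#include <stddef.h>

/* Letters first, digits last: the first 52 entries may start a token. */
static const char alphabet[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

#define FIRST_RADIX 52  /* letters only for the first character */
#define REST_RADIX  62  /* letters and digits afterwards */

/*
 * Write the token for the given counter value into buf.
 * Each index yields a distinct token, and shorter tokens are used first.
 * Returns the token length, or -1 if buf is too small.
 */
int make_token(unsigned long index, char *buf, size_t buf_size)
{
    size_t len = 0;

    assert(buf != NULL);
    if (buf_size < 2)
        return -1;

    buf[len++] = alphabet[index % FIRST_RADIX];
    index /= FIRST_RADIX;

    /* Remaining characters use bijective base 62, so no length is skipped. */
    while (index > 0) {
        if (len + 1 >= buf_size)
            return -1;
        index--;
        buf[len++] = alphabet[index % REST_RADIX];
        index /= REST_RADIX;
    }
    buf[len] = '\0';
    return (int)len;
}
```

Indexes 0 to 51 give the one-character tokens a to Z, and index 52 gives the first two-character token, aa. A generator built on this would still have to skip candidates that collide with reserved tokens such as do or if, and with symbols already present in the source.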

Example

The following example minimises the compiled size of SQLite:
SQLite version: 3.20.1
Executable: 64-bit x86_64
Platform: macOS Sierra

Version  Source size  Compiled size  Note
1        9.3 MB       1.4 MB         original version, shell executable without TCL extension.
2        1.7 MB       741 KB         with C API wrapper, without TCL extension, without FTS, rtree, rbu, icu, session.
3        1.6 MB       651 KB         optimised version by Beak, with full functional features, C API wrapper plus compressed BLOB, and SEE (SQLite Encryption Extension) support.

Excellent software should be elegant, reliable and small. Building small software is especially important for deeply embedded systems. Lovingly crafting software to be as elegant and small as possible is what cultivates a real programmer.







If you can't open the link above, please launch the App Store and search for Beak.


Advanced Usage

Please drop us a message if your software requires high security, your source code needs to be obfuscated, or you need Beak to support a new programming language.