TextMaestro

© 2003 TextMaestro Technologies

 


 

 

Example 17: Convert Assembly Code to C

Introduction: If you were to ask us, "Why did you create TextMaestro?", we would simply reply: to convert Assembly code. All other features evolved out of this. No other tool out there can handle the hugely repetitive task of rewriting Assembly code into corresponding C code the way TextMaestro does. TextMaestro can handle any type of Assembly language, including Alpha, Intel 8086, Motorola, and MIPS. All you need is a library for each kind.

"Why would somebody convert Assembly code?", you might ask. The answer is simple. Vast amounts of critical code have been written in Assembly. There comes a time when porting this code to new hardware or a new operating system becomes both a necessity and a nightmare.

However, developers embarking on such a project soon find themselves entangled in two formidable problems: (1) they do not know the nuances of the Assembly language well enough, and (2) they are not intimately acquainted with the algorithms behind the code. Together, these make reverse engineering almost impossible.

Remember, learning Assembly instructions by picking up a book is not difficult. Being able to juggle those instructions and express one's thoughts elegantly takes a lifetime.

With TextMaestro, you can transliterate legacy Assembly code to C without knowing Assembly in depth and without knowing the algorithms crafted in the code. Once you have the complete code in C, you can readily port it to a new platform or environment (simply because it is in C). Then you can comfortably begin the process of reverse engineering from a position of strength.

With that said, a cautionary note is needed. TextMaestro has no magic formula to perform this feat. You, the user, prepare a library by studying the original code. You will need to put forth a good amount of effort to bring the library up to a working stage.

Developing the library is an iterative process. We provide various libraries in our Repository section (which is under development). When it comes to converting Assembly code, the provided libraries do not guarantee completeness. It is almost certain that you will need to enhance the provided library. That is where we can provide additional assistance with our customized service. Below we provide a simple step-by-step example.

Consider the following Alpha assembly code:

        .title  SQUARES2    Table of Squares (OpenVMS)

        $routine    squares2, data_section_pointer=true,-

                     kind=stack, saved_regs=<r2,r15>

        $data_section

sq1::   .blkq   1                   ; To store 1 squared

sq2::   .blkq   1                   ; To store 2 squared

sq3::   .blkq   1                   ; To store 3 squared

                                    ; etc.

        $code_section

        .base   r27,$ls             ; R27 -> linkage section

        ldq     r15,$dp             ; R15 -> data section

        .base   r15,$ds             ; Tell MACRO to use this

first:: mov     1,r1                ; R1 = first difference

        mov     2,r2                ; R2 = second difference

        mov     1,r0                ; R0 = first square

        stq     r0,sq1              ;  to be stored

        addq    r2,r1,r1            ; Adjust first difference

        addq    r1,r0,r0            ; R0 = second square

        stq     r0,sq2              ;  to be stored

        addq    r2,r1,r1            ; Adjust first difference

        addq    r1,r0,r0            ; R0 = third square

        stq     r0,sq3              ;  to be stored

done::  mov     1,r0                ; Signal all is normal

        $return                     ; Return to OpenVMS

        $end_routine squares2       ; Needed by $routine

        .end    squares2            ; Set start address

This code computes the squares of 1, 2, and 3 and stores them in three distinct memory locations, sq1, sq2, and sq3. Since the program has no output statements of its own, we use a debugger to review the intermediate values of r0.

Note that the squares2.m64 listing above is for VMS; for Unix, the code looks a little different. Here is a screenshot from a VMS debugger console.

 

We want to convert the above assembly code to C code.

Step 1: Open TextMaestro. Click on the Text Conversion button (highlighted below).

 

The following Interactive Text Conversion dialog will appear.

Typically, we define a macro using Find Text and Replace Text in the upper two text fields, and convert Input Text to Output Text in the lower two fields.

In addition, you will see this dialog box, where you assign some attributes to a macro.

 

For this example we will need a set of macros, so we use the Library feature of TextMaestro.

Click on the Open Library button. The following Text Conversion Library dialog will appear.

The left listbox lists all the libraries defined thus far. The right listbox lists the macros for a particular library. Our plan is to define a set of macros in the right listbox and execute them on our Input Text.

For now, delete the macro in the right listbox. You may also rename the current library, Find Rep library (sample), in the left listbox to, say, Convert Assembly - Alpha. We will show you the making of this library step by step.

(Double-click on the item, or click on the Modify button, to rename a library.)

 

Enter the above Assembly code in the Input Text area. Before we move on, note that not all instructions will make it into the C code; assembly-specific instructions will be dropped at the end of the process. In summary, the red-highlighted lines will be dropped, and the green ones will be kept.

        .title  SQUARES2    Table of Squares (OpenVMS)

        $routine    squares2, data_section_pointer=true,-

                     kind=stack, saved_regs=<r2,r15>

        $data_section

sq1::   .blkq   1                   ; To store 1 squared

sq2::   .blkq   1                   ; To store 2 squared

sq3::   .blkq   1                   ; To store 3 squared

                                    ; etc.

        $code_section

        .base   r27,$ls             ; R27 -> linkage section

        ldq     r15,$dp             ; R15 -> data section

        .base   r15,$ds             ; Tell MACRO to use this

first:: mov     1,r1                ; R1 = first difference

        mov     2,r2                ; R2 = second difference

        mov     1,r0                ; R0 = first square

        stq     r0,sq1              ;  to be stored

        addq    r2,r1,r1            ; Adjust first difference

        addq    r1,r0,r0            ; R0 = second square

        stq     r0,sq2              ;  to be stored

        addq    r2,r1,r1            ; Adjust first difference

        addq    r1,r0,r0            ; R0 = third square

        stq     r0,sq3              ;  to be stored

done::  mov     1,r0                ; Signal all is normal

        $return                     ; Return to OpenVMS

        $end_routine squares2       ; Needed by $routine

        .end    squares2            ; Set start address

 

 

This suggests that we will need macros to handle the mov, stq, and addq instructions. Below is a simple explanation of these Alpha instructions:

 

| Instruction | Example | What it means | Corresponding C code |
|---|---|---|---|
| mov | mov  2, r2 | Move 2 to register 2. | r2 = 2; |
| stq | stq  r0, sq1 | Store the contents of register 0 in the predefined memory location sq1 (pointed to by ptr_sq1). | sq1 = r0; or, through a pointer, *ptr_sq1 = r0; |
| addq | addq  r2, r1, r1 | Add the contents of registers 2 and 1 and put the result in register 1. | r1 = r2 + r1; |

 

Let us dissect how to define a macro for mov. Use the following:

 

Set Attributes:

(1) Use wild-card in Find

(2) Select P to enable Per-line search

 

Next, define Find Text and Replace Text as shown below.

| Find Text | Replace Text |
|---|---|
| *mov^b*,*^b;* | [+]   <*3> = <*2>; <a=37>//<*4> |

 

It is perhaps most helpful to view the above macro in a picture.

At this stage, click on the Convert button, and you will see the effect of this macro, as shown below:

Now that the mov macro works, hold down the Ctrl key and click on the Library button to save the macro to the Library.

We still need to add two more macros, for stq and addq. There are two ways to do that: (1) the way we described above, or (2) using the Add, Duplicate, and Modify buttons on the right-hand side of the Library dialog. Assuming we have defined all the macros (as shown below), we click on the Convert button in the Library dialog.

 

 

You may wonder why there is a "[+]". Recall that we want to keep only the translated text. "[+]" is just a flag that distinguishes translated lines from untranslated ones. We may choose to add a fourth macro to filter the desired text in.

In text form, the library is the following:

| Knobs | Find Text | Replace Text |
|---|---|---|
| P-x-x-x | *mov^b*,*^b;* | [+]   __<*3> = __<*2>; <a=37>//<*4> |
| P-x-x-x | *stq^b*,*^b;* | [+]   __<*3> = __<*2>; <a=37>//<*4> |
| P-x-x-x | *addq^b*,*,*^b;* | [+]   __<*4> = __<*2> + <*3>; <a=37>//<*5> |
| P-x-F-x | [+]* | * |

It is the 'F' knob above, which appears on the Attributes dialog as you see here, that keeps the final desired text.

 

   r1 = 1;                       // R1 = first difference

   r2 = 2;                       // R2 = second difference

   r0 = 1;                       // R0 = first square

   sq1 = r0;                     //  to be stored

   r1 = r2 + r1;                 // Adjust first difference

   r0 = r1 + r0;                 // R0 = second square

   sq2 = r0;                     //  to be stored

   r1 = r2 + r1;                 // Adjust first difference

   r0 = r1 + r0;                 // R0 = third square

   sq3 = r0;                     //  to be stored

   r0 = 1;                       // Signal all is normal

The clean output then looks like the listing above.

 

You may have noticed that the last line of the above text is a "false positive": it sets the return status rather than computing a square. There are ways to filter such lines out, for example by adding flag-oriented macros. For now, we simply delete it and use the rest of the text to write a C function manually. Here is the result.

#include <stdio.h>

 

static void my_sqr_exp(void)

{

   int r0, r1, r2;

   int sq1, sq2, sq3;

 

   r1 = 1;                       // R1 = first difference

   r2 = 2;                       // R2 = second difference

   r0 = 1;                       // R0 = first square

   sq1 = r0;                     //  to be stored

   r1 = r2 + r1;                 // Adjust first difference

   r0 = r1 + r0;                 // R0 = second square

   sq2 = r0;                     //  to be stored

   r1 = r2 + r1;                 // Adjust first difference

   r0 = r1 + r0;                 // R0 = third square

   sq3 = r0;                     //  to be stored

 

   printf("  sqr(1) = %d\n", sq1);

   printf("  sqr(2) = %d\n", sq2);

   printf("  sqr(3) = %d\n", sq3);

}

 

int main(void)

{

   my_sqr_exp();

   return 0;

}

 

Here is a sample execution:

mytmt01<+> cc test.c

mytmt01<+> a.out

  sqr(1) = 1

  sqr(2) = 4

  sqr(3) = 9

mytmt01<+>

 

 

 

Q&A:

  • How can I get a pre-fabricated library?

You'll need to purchase one. We gave away the know-how above; what you pay for is the labor to create the library.

 

  • How much will it cost?

It depends on the complexity. If it is trivial, it's likely to be free.

 

  • Is the final output error-free?

No. It needs rigorous scrutiny on your part. We take no responsibility for incomplete translation.

 

  • Can this procedure be applied to a set of files?

Yes. Use Batch mode.

 

 
