After tcc_compile_string, is there a way to get the length of the code region?

For exploration and demonstration, I tried mixing a little tcc and udis86. This is driven from GnuCOBOL, so there is no direct access to C headers or to members of the TCCState struct, although that can be worked around with a little manual work and/or text preprocessing to get the proper widths and offsets.
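
As an illustration of the "manual work" for widths, here is a minimal C sketch that turns a byte count into a COBOL data description for an opaque buffer. The helper name `format_cobol_field` is hypothetical and only for illustration; it is not part of tcc, udis86, or GnuCOBOL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a COBOL level-01 description that reserves
 * `size` opaque bytes under the given data name.
 * Returns the number of characters written (as snprintf does). */
int format_cobol_field(char *out, size_t outlen, const char *name, size_t size)
{
    assert(out != NULL && name != NULL);
    return snprintf(out, outlen, "01 %s pic x(%zu).", name, size);
}
```

A small C program that includes `udis86.h` could call this with `sizeof(ud_t)` as the size and paste the result into the COBOL source; the same idea would extend to `offsetof` for individual struct members.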



tcc was built with --disable-static to get a shared library; udis86 came straight from the Fedora repos.



  *> tcc-udis  tcc as libtcc.so, and udis86 for some disassembly
*> Tectonics:
*> cobc -xj -g tcc-udis.cob -ltcc -ludis86
*>
>>SOURCE FORMAT IS FREE
identification division.
program-id. sample.

environment division.
configuration section.
repository.
function all intrinsic.

REPLACE ==newline== BY ==& x"0a" &==.

data division.
working-storage section.

01 TCC-OUTPUT-MEMORY constant as 1.
01 TCC-RELOCATE-AUTO usage pointer.

01 tcc usage pointer.
01 rc usage binary-long.
01 prog-entry usage program-pointer.

01 udis pic x(632). *> sizeof(ud_t)
01 ud-translate usage program-pointer. *> AT&T or INTEL
01 code-size usage binary-long.
01 formatted usage pointer.
01 running-offset pic 9999.
01 spacer pic x(32). *> max 16 byte instruction

01 prog. 05 value
"#include <stdio.h>" newline
"int hello() { printf(""%s"", ""Hello, tcc\n""); }"
.

procedure division.

call "tcc_new" returning tcc
if tcc equal null then
display "error: tcc_new failed" upon syserr
goback
end-if

call "tcc_set_output_type" using by value tcc TCC-OUTPUT-MEMORY
returning rc
if rc not equal zero then
display "error: tcc_set_output_type " rc upon syserr
goback
end-if

call "tcc_compile_string" using by value tcc by reference prog
returning rc
if rc not equal zero then
display "error: tcc_compile_string " rc upon syserr
goback
end-if

*> in C this is set to (void*)1
set TCC-RELOCATE-AUTO up by 1
call "tcc_relocate" using by value tcc
by value TCC-RELOCATE-AUTO
returning rc
if rc not equal zero then
display "error: tcc_relocate " rc upon syserr
goback
end-if

call "tcc_get_symbol" using by value tcc by reference "hello"
returning prog-entry
if prog-entry equal null then
display "error: tcc_get_symbol hello " upon syserr
goback
end-if

call prog-entry

move 43 to code-size
perform disassemble
display space

set prog-entry to entry "cob_embed_python"
move 23 to code-size
perform disassemble

call "tcc_delete" using by value tcc returning omitted
goback.

*> take a look at some disassembly
disassemble.
call "ud_init" using udis
call "ud_set_mode" using udis by value 64 *> 64bit
call "ud_set_vendor" using udis by value 2 *> Any
call "ud_set_input_buffer" using udis value prog-entry code-size

set ud-translate to entry "ud_translate_att"
call "ud_set_syntax" using udis by value ud-translate

move 0 to running-offset
call "ud_disassemble" using udis returning rc
perform until rc equal zero

call "ud_insn_hex" using udis returning formatted
display running-offset space content-of(formatted)
spacer(1:32 - rc * 2) with no advancing

add rc to running-offset

call "ud_insn_asm" using udis returning formatted
display space content-of(formatted)

call "ud_disassemble" using udis returning rc
end-perform
.
end program sample.


udis86's ud_set_input_buffer wants a size. It would be nice to be able to pass an exact value, as determined by tcc when compiling with TCC_OUTPUT_MEMORY.



This works fairly well for exploring, but the code-size values in the sample are just guesses, refined by running, counting bytes, changing the source, and running again. The second length (23) was purposely too short, to cut off an instruction in the second disassembly as a demonstration.



prompt$ cobc -xj -g tcc-udis.cob -ltcc -ludis86
Hello, tcc
0000 55 push %rbp
0001 4889e5 mov %rsp, %rbp
0004 4881ec00000000 sub $0x0, %rsp
0011 488d0571100000 lea 0x1071(%rip), %rax
0018 4889c6 mov %rax, %rsi
0021 488d0564100000 lea 0x1064(%rip), %rax
0028 4889c7 mov %rax, %rdi
0031 b800000000 mov $0x0, %eax
0036 e817000000 call 0x40
0041 c9 leave
0042 c3 ret

0000 55 push %rbp
0001 4889e5 mov %rsp, %rbp
0004 53 push %rbx
0005 4881ec88010000 sub $0x188, %rsp
0012 89bd7cfeffff mov %edi, -0x184(%rbp)
0018 89b578feff invalid
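
Short of an exact length from tcc, one stopgap for the first listing is to scan for the `leave; ret` pair (bytes `c9 c3`) that ends it. This is a heuristic sketch, not something tcc provides: `guess_code_size` is a hypothetical name, it does not decode instructions, and it assumes the function ends with that two-byte epilogue and that the pair does not appear earlier inside an immediate or displacement. The byte array is copied from the `hello` listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Heuristic only: return the number of bytes up to and including the first
 * "leave; ret" (0xc9 0xc3) pair, or 0 if no pair is found within `limit`
 * bytes. The caller must make sure `limit` bytes are readable. */
size_t guess_code_size(const unsigned char *code, size_t limit)
{
    assert(code != NULL);
    for (size_t i = 0; i + 1 < limit; i++) {
        if (code[i] == 0xc9 && code[i + 1] == 0xc3)
            return i + 2;
    }
    return 0;
}

/* Bytes of hello() as printed by ud_insn_hex in the first listing. */
const unsigned char hello_bytes[] = {
    0x55,                                      /* push %rbp            */
    0x48, 0x89, 0xe5,                          /* mov  %rsp, %rbp      */
    0x48, 0x81, 0xec, 0x00, 0x00, 0x00, 0x00,  /* sub  $0x0, %rsp      */
    0x48, 0x8d, 0x05, 0x71, 0x10, 0x00, 0x00,  /* lea  0x1071(%rip)    */
    0x48, 0x89, 0xc6,                          /* mov  %rax, %rsi      */
    0x48, 0x8d, 0x05, 0x64, 0x10, 0x00, 0x00,  /* lea  0x1064(%rip)    */
    0x48, 0x89, 0xc7,                          /* mov  %rax, %rdi      */
    0xb8, 0x00, 0x00, 0x00, 0x00,              /* mov  $0x0, %eax      */
    0xe8, 0x17, 0x00, 0x00, 0x00,              /* call                 */
    0xc9,                                      /* leave                */
    0xc3                                       /* ret                  */
};
```

Calling `guess_code_size(hello_bytes, sizeof hello_bytes)` on these bytes gives 43, matching the hand-counted value in the sample. It would not help with the second listing, which is not tcc output and may end differently.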


All that for the question in the title. I'm hoping that this is just a blinders problem, and I missed the obvious when looking through the tcc source headers.



Have good

      cobol disassembly tcc

asked Nov 9 at 8:07 by Brian Tiffin