Subject: Re: Disassembly of old Turbo Pascal (V3) code - how to create data
From: Robert Prins
Newsgroups: comp.lang.asm.x86
Organization: A noiseless patient Spider
Date: Fri, 7 May 2021 22:09 UTC
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
Lines: 113
Sender: Robert Prins <robert.ah.prins@gmail.com>
Approved: fbkotler@myfairpoint.net - comp.lang.asm.x86 moderation team.
Message-ID: <3d02ef5f-ae86-12cd-b9e5-ef03807add29@prino.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: reader02.eternal-september.org; posting-host="427e9ac8a1a8212b552c0c364e5d6ad4";
logging-data="10858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+J0YVuYwtTq8v9xrpYXaCwg4T3cv0uSLk="
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:3dTUkzhjfVcoe/sG2lZfjnqQ5XY=
On 2021-04-17 13:48, Robert Prins wrote:
 > Hi all,
 >
 > I would like to disassemble the final version of a self-written Turbo Pascal V3 program, i.e. a simple .COM file, and to that effect I've dug out my old (AD 2004) registered copy of IDA Pro (V4.7.0.831). Not having used it for more than 10 years, and no longer having access to their forum, I'm now stuck. The .COM file loads, IDA happily disassembles it, but it just creates one single segment, and I no longer have a clue how to create the data segment. There's a bit of info in the TP3 Manual, and using David Lindauer's GRDB in DOSBox-X allows me to single-step through the RTL initialisation code, which shows me that it sets up DS and SS, but that doesn't help me in setting up these segments in IDA.
 >
 > I've tried the "Create Segment" option, but I'm lost entering the required values for start address, end address and base. The "class" is probably "DATA". The values for the single "seg000" that IDA creates are CODE, start @ 0x0100, end @ 0xD623, which leads me to assume that a to-be-created "seg001" should start at 0x0000, end at 0xffff, and have a base of 0xd63 (paragraphs), but that results in a "Bad segment base: segment would have bytes with a negative offset" pop-up.
 >
 > Trying start @ 0xd630, end @ 0x1d630, with a base 0x0000 creates a segment, but it looks like
 >
 > seg000:D622
 > seg001:C8C00 ; ---------------------------------------------------------------------------
 > seg001:C8C00
 > seg001:C8C00 ; Segment type: Regular
 > seg001:C8C00 seg001          segment byte public '' use16
 > seg001:C8C00                 assume cs:seg001
 > seg001:C8C00                 ;org 0C8C00h
 > seg001:C8C00                 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
 >
 > Which may be correct, but the "org 0c8c00" makes absolutely no sense to me.

I've had a bit, or rather a huge amount, of help from Hex-Rays' Ilfak Guilfanov, and using the names in "scg.zip" (found @ https://www.pcengines.ch/tp3.htm), I've got a complete disassembly of the compiler. I cut down the IDA-generated .IDC file to include just the info about the RTL, manually changed some data (which at some stage should be done with built-in IDC functions), wrote a bit of REXX to add identifiers to every Pascal procedure (basically inline statements that jump over upper-cased procedure names in Pascal-string format) and got myself a nice assembly listing, with code that's obviously working, but very "simple" (let's just leave it at that...)

I could let IDA generate an assembler listing and hack that to pieces, most likely in some automated way, as there are dozens of procedures that look like

cseg:4E77 proc            day_ptr_is_td_top near
cseg:4E77
cseg:4E77                 push    bp
cseg:4E78                 mov     bp, sp
cseg:4E7A                 push    bp
cseg:4E7B                 jmp     $+3
cseg:4E7E ; ------------------------------------------------------------
cseg:4E7E
cseg:4E7E @01:
cseg:4E7E                 jmp     short @02
cseg:4E7E ; ------------------------------------------------------------
cseg:4E80                 db 17,'DAY_PTR_IS_TD_TOP'
cseg:4E92 ; ------------------------------------------------------------
cseg:4E92
cseg:4E92 @02:
cseg:4E92                 mov     eax, [td_top]
cseg:4E96                 mov     [day_ptr], eax
cseg:4E9A                 mov     [winday_top], eax
cseg:4E9E                 call    _day_list_is_day_ptr
cseg:4EA1                 jmp     $+3
cseg:4EA4 ; ------------------------------------------------------------
cseg:4EA4
cseg:4EA4 @03:
cseg:4EA4                 mov     sp, bp
cseg:4EA6                 pop     bp
cseg:4EA7                 retn
cseg:4EA7 endp            day_ptr_is_td_top

where neither the stack frame nor the "jmp $+3" instructions are required.
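
The embedded-name pattern in the listing above (a "jmp short" over a length-prefixed, upper-cased name) can be located mechanically. The following is only a minimal Python sketch of that idea, not the REXX I actually used: the function name, the `load_offset` parameter and the set of characters accepted in a name are assumptions made for illustration.

```python
JMP_SHORT = 0xEB  # opcode of the 8086 "jmp short rel8" instruction


def is_name_byte(b):
    """Assumed identifier alphabet: A-Z, 0-9 and underscore."""
    return 0x41 <= b <= 0x5A or 0x30 <= b <= 0x39 or b == 0x5F


def find_procedure_names(image, load_offset=0x100):
    """Return (address, name) pairs for every 'jmp short' that skips
    exactly one Pascal string (length byte followed by the name)."""
    names = []
    i = 0
    while i + 2 < len(image):
        if image[i] == JMP_SHORT:
            disp = image[i + 1]      # bytes skipped by the jump
            length = image[i + 2]    # Pascal-string length byte
            end = i + 3 + length
            # The jump must skip the length byte plus the name, no more.
            if length > 0 and disp == length + 1 and end <= len(image):
                raw = image[i + 3:end]
                if all(is_name_byte(b) for b in raw):
                    names.append((load_offset + i, raw.decode("ascii")))
                    i = end
                    continue
        i += 1
    return names
```

Fed the bytes of the procedure shown above, with `load_offset` set to the address of its first byte, this should report the name at the address of the "jmp short @02". A jump that happens to skip name-like bytes by coincidence would be reported too, so the result needs checking against the disassembly.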

However, right now I've started to think about something else, making a few tweaks to the compiler itself. IDA Pro has a built-in assemble command, and can save a changed .COM file, but that would result in an output file with just a lot of NOP instructions, like 20+ in the random number generator

x(n+1) = (x(n) * 129 + 907633385) mod 2^32

32-bit multiplication and addition are easier on a 32-bit CPU than on a 16-bit one...
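
To illustrate the difference, here is a hedged Python sketch of the recurrence above, once as a single 32-bit operation and once split into 16-bit halves with explicit carry handling. The function names are hypothetical and the 16-bit variant only mimics the kind of work a 16-bit CPU has to do; it is not a transcription of the RTL code.

```python
MULTIPLIER = 129
INCREMENT = 907633385
MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF


def next_random_32(x):
    """One step of x(n+1) = (x(n) * 129 + 907633385) mod 2^32."""
    return (x * MULTIPLIER + INCREMENT) & MASK32


def next_random_16(lo, hi):
    """The same step on a value held as two 16-bit words (lo, hi)."""
    p_lo = lo * MULTIPLIER            # 16x16 product, needs 32 bits
    p_hi = hi * MULTIPLIER            # only its low word survives mod 2^32
    s_lo = (p_lo & MASK16) + (INCREMENT & MASK16)
    new_lo = s_lo & MASK16
    carry = s_lo >> 16                # carry out of the low-word addition
    new_hi = (p_hi + (p_lo >> 16) + (INCREMENT >> 16) + carry) & MASK16
    return new_lo, new_hi
```

Both variants produce the same sequence; the second simply makes the extra multiply, the word shuffling and the carry propagation visible.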

But of course it would be more interesting to see if it's possible to retrofit Norbert Juffa's enhanced 6-byte-real IEEE-compliant (as far as that's possible in this format) arithmetic to the RTL. That, however, would not realistically be possible via the assemble command, but would require a real reassembly. IDA Pro provides two options for generating source, "generic" (aka MASM?) or TASM "Ideal" mode.

Now I can probably figure out what to change where to let Turbo set up its segmentation magic, but my disassembly contains a data segment with the uninitialised RTL variables, and I don't want/need that in a .COM file. The assembler listing generated by the program in the above-mentioned scg.zip (to be assembled with "AS" from the same archive) just has a series of "var = value" lines to set up these variables. So is there a way to create in TASM/MASM some kind of "dummy" data segment just to set up variable names/offsets? Googling for dummy/virtual segment doesn't come up with anything helpful, but I'm sure that this is not an uncommon situation.

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/indez.html
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html