Experience with lhTools for PC-1500

kuzja
Runs at 75 baud
Posts: 17
Joined: 01 Apr 2021 14:26

Re: Experience with lhTools for PC-1500

Post by kuzja »

I did some testing of the new version of lhasm (ver. 0.7.8) and I can confirm that it fixes the issues I reported in my first post, namely points 2, 3, 4 and 7. Thanks a lot for the good work! :)

I am in some doubt as to whether I should go into more detail with my observations. While the corrected issues had some practical impact and their fixes make life easier for the programmer, the remaining ones are rather nitpicking :) and probably won't have much effect even if improved...

Well, I'll give the basic idea, but really, none of this limits work with lhasm. I don't want to be ungrateful: lhTools are wonderful utilities, and these small things cannot spoil the good impression they make.

The point is that there are several contexts in which the assembler expects a value. I've identified the following cases (there may be more):
  1. .ORIGIN directive
  2. .EQU directive
  3. 8-bit immediate value like in LDA n
  4. 16-bit immediate value like in LD SP,mn
  5. displacement like in JR d
  6. absolute address in JP ab
In theory, such a value could be a numeric constant, an expression, a symbol, a variable, or a combination of these. Of course, there can be intentional limitations (e.g. .ORIGIN could accept only a numeric constant), but I would expect a "value" (in any form) to be evaluated the same way everywhere.

However, it seems the assembler uses different routines to evaluate the value in different contexts. As a result, the same expression is accepted in one context but rejected in another.
Below are several examples.

Example 1: Using a decimal value in an expression

Code:

	LD H,[+#10]55	; compilation error
	JR +#10		; ok
VAR1	.EQU +#160	; ok
VAR2	.EQU  [+#10]4455	; compiles incorrectly as &440A
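
As a sanity check of what these lines should produce, here is a small Python sketch. It assumes that [+val]operand simply adds val to the operand (as [+100]VAR1 giving 4555 elsewhere in this thread suggests), that # marks a decimal value, and that unmarked values are hexadecimal. The helper name `expected` is made up for this illustration and is not part of lhasm.

```python
def expected(operand_hex, offset_dec, bits):
    """Add a decimal offset to a hexadecimal operand and format the result.

    Assumption: [+#n]operand means operand + n, truncated to the operand width.
    """
    mask = (1 << bits) - 1
    value = (int(operand_hex, 16) + offset_dec) & mask
    return f"{value:0{bits // 4}X}"


print(expected("55", 10, 8))     # LD H,[+#10]55        -> 5F
print(expected("4455", 10, 16))  # VAR2 .EQU [+#10]4455 -> 445F
```

Under those assumptions, the .EQU line should give &445F rather than the &440A reported above.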

Example 2: With and without leading zero(s)

Code:

	LDA 5	; compilation error, should be '05'
	LDA [+5]00	; ok - in expression '0' not needed
; similarly:
VAR1	.EQU 445	; error
VAR2	.EQU [+A]4455	; ok
	JR +3	; error
	JR +[+3]00	; ok

A special case is the set of values accepted by the .ORIGIN directive:

Code:

	.ORIGIN 4400	; ok - standard use
	.ORIGIN 440	; ok, even though leading '0' is missing
	.ORIGIN &4400	; ok - new corrected behaviour
	.ORIGIN \X4400	; error (also for \U, \O)
	.ORIGIN #10000	; ok
	.ORIGIN @37777	; error
	.ORIGIN VAR1	; ok
	.ORIGIN [+100]VAR1	; error

...which differs from the other use cases.

I can do more testing if required, but again, I want to stress that this is nothing important, and I would not have come across these things had it not been for the previous discussion and the testing of the new version. I suppose the observed behaviour has historical reasons, and it would probably be difficult, and very likely not worthwhile, to unify it across the various contexts...

So thanks again for your work!
cgh
Runs at 2400 baud
Posts: 2142
Joined: 30 Aug 2011 12:23
Location: You are here -> .

Re: Experience with lhTools for PC-1500

Post by cgh »

Thanks a lot for all the tests and the reports here :)

In fact, you have pointed at the worst and most obscure part of lhasm: its evaluator!
Developing it so as to stay compatible with older versions of lhdump (and with my own sources), together with the perpetual enhancements, has built a monster that is ambiguous and somewhat confusing. This is clearly the part that I need to rework fully. And the choices made were not always good...

I won't change it now, as I expect to publish the final release, 0.8.0, soon... hmmm... this year... I hope!

The evaluator was inherited from the ancestor of lhasm, named lhbin, which shipped with the earliest versions. lhasm was introduced in 0.4.0 (2014) and replaced the old lhbin. As lhasm is still able to handle lhbin sources, I am keeping the current evaluator. Sorry.
In the early days of the evaluator, only hexadecimal values, symbols and variables were allowed. A hexadecimal value must have a length that is a multiple of 2; that is, 3 and 003 are invalid but 03 and 0003 are valid. In your example, JR +3 is rejected but JR +03 is accepted. By the way, the following source is correct and produces the expected code:

Code:

3	.EQU	03
	JR	+3
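
The even-length rule for legacy hexadecimal values can be sketched in a few lines of Python. This only illustrates the rule as described here; it is not the actual lhasm code, and the function name is hypothetical.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def is_legacy_hex(text):
    """True if text is a non-empty, even-length run of hexadecimal digits."""
    return (
        len(text) > 0
        and len(text) % 2 == 0
        and all(c in HEX_DIGITS for c in text)
    )


for sample in ("3", "003", "03", "0003"):
    print(sample, is_legacy_hex(sample))
```

Because 3 fails this test, it cannot be a hexadecimal literal, which is presumably why it remains free to be used as a symbol name in the source above.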

As you pointed out, it depends on the context. However, it is not the evaluator that is context-dependent, but the directives and instructions. This is why some directives (.ORIGIN, .EQU) and some instructions may accept or reject constructions that look valid for others.

I will try to describe this using the LDA instruction as an example. There are several valid syntaxes:

Code:

LDA rH		: High 8-bit register: B, D or H (1)
LDA rL		: Low 8-bit register: C, E or L (2)
LDA (R)		: 16-bit register indirection: (BC), (DE) or (HL) (3)
LDA F		: Status flags register (4)
LDA (mn)	: 16-bit address indirection (5)
LDA n		: 8-bit immediate value (6)

As lhasm and lhdump share the same instruction description table (see lh5801.c), the assembler fetches the closest instruction matching the arguments. If you write LDA B, nothing is possible except loading A from a high 8-bit register (1). If you write LDA 00, nothing is possible except loading A with the immediate 8-bit value 00 (6). If you write LDA BC, it is more complicated: BC matches a 16-bit register, but no such instruction can be fetched. So BC is taken to be an 8-bit value and the assembler chooses LDA &BC (6). To prevent any ambiguity, I introduced the value markers: &<hexa>, #<decimal>, @<octal>, $<ascii-char>.
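
The matching order described for LDA can be pictured with a rough Python sketch. It is not derived from lh5801.c: the tables, the function names and the exact order of the checks are assumptions made for illustration, and the register pairs are taken to be BC, DE and HL.

```python
import re

HIGH_REGS = {"B", "D", "H"}
LOW_REGS = {"C", "E", "L"}
PAIR_REGS = {"BC", "DE", "HL"}          # assumed 16-bit register names
BASES = {"&": 16, "#": 10, "@": 8}      # value markers named in the post


def parse_value(text):
    """Parse a marked value, or an unmarked even-length hexadecimal value."""
    if len(text) == 2 and text[0] == "$":
        return ord(text[1])                      # $<ascii-char>
    if len(text) > 1 and text[0] in BASES:
        return int(text[1:], BASES[text[0]])     # &<hexa>, #<decimal>, @<octal>
    if len(text) % 2 == 0 and re.fullmatch(r"[0-9A-Fa-f]+", text):
        return int(text, 16)                     # legacy unmarked hexadecimal
    raise ValueError(f"not a value: {text!r}")


def match_lda(operand):
    """Return (form number, decoded operand) for 'LDA <operand>'."""
    if operand in HIGH_REGS:
        return 1, operand
    if operand in LOW_REGS:
        return 2, operand
    if operand == "F":
        return 4, operand
    inner = re.fullmatch(r"\((.+)\)", operand)
    if inner:
        name = inner.group(1)
        if name in PAIR_REGS:
            return 3, name
        return 5, parse_value(name)
    # A bare 16-bit register name such as BC has no LDA form of its own,
    # so it falls through and is read as an 8-bit immediate value.
    value = parse_value(operand)
    if value > 0xFF:
        raise ValueError(f"immediate out of range: {operand!r}")
    return 6, value


for op in ("B", "00", "BC", "&BC", "(HL)", "(4455)"):
    print(op, match_lda(op))
```

In this sketch LDA BC and LDA &BC end up as the same form (6), but only the marked spelling makes the intent explicit to the reader.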

The .ORIGIN: directive is special in that it does not call the evaluator. So it accepts only symbols, <hexa 2 or 4 characters length>, &<hexa>, \x<hexa>, #<decimal> and \u<decimal>. I have just discovered that I made a mistake with octal values, which are read as decimal... The use of \x, \u and \o is wrong; it must be \X, \U and \O. I never noticed because I don't use this syntax... :oops:

The evaluator is not called by all directives because some (like .ORIGIN: and .ALIGN:) require an immediate, unambiguous value that must not change between passes 1 and 2.

Many thanks to kuzja for the time spent testing lhasm :geek: :D
There are those who see things as they are and ask why, and there are those who imagine things as they could be and say... why not? - George Bernard Shaw
I love talking about nothing; it is the only subject on which I have some vague knowledge! - Oscar Wilde
It is not because things are difficult that we do not dare. It is because we do not dare that things are difficult. - Seneca
kuzja

Re: Experience with lhTools for PC-1500

Post by kuzja »

Thanks for the interesting background information!

Backwards compatibility is always a difficult thing to handle and usually there is no flawless solution. But I agree it is important to keep it so that older source code can still be compiled. There are several ways to cope with it, but they have their disadvantages too. Looking at other projects I've seen, I can think of two "workarounds":
- The program uses some kind of parameter (a "compiler directive") that allows switching between the "old" and "new" rules. In effect, there are two programs in one.
- The program is no longer compatible with older input data, but there is a separate tool (a "pre-processor") that converts the old data into the new format. The program itself is then simpler, but extra effort is needed to develop the tool.

But these are just my thoughts on the "backward compatibility" topic; it's clear that for a project like this, the least demanding solution has to be chosen and all the "nice to have" things have to be set aside.

cgh wrote:
In the early days of the evaluator, only hexadecimal values, symbols and variables were allowed. A hexadecimal value must have a length that is a multiple of 2; that is, 3 and 003 are invalid but 03 and 0003 are valid.

Actually, this requirement is documented in the manual, so what is surprising is that it is not required in an expression like [+3]. :)

You call the evaluator the most obscure part of lhasm, but it still does a great job! :)
cgh

Re: Experience with lhTools for PC-1500

Post by cgh »

kuzja wrote: 20 Apr 2021 21:40
cgh wrote: In the early days of the evaluator, only hexadecimal values, symbols and variables were allowed. A hexadecimal value must have a length that is a multiple of 2; that is, 3 and 003 are invalid but 03 and 0003 are valid.
Actually, this requirement is documented in the manual, so what is surprising is that it is not required in an expression like [+3]. :)

The "obscure" evaluator :oops: :oops: :oops: :mrgreen:
No, in fact, when a value is written [opval], the val is read directly; the evaluator does not call itself in this case. The ability to call the evaluator using the quote sign ' was introduced later, because I needed it for a piece of software I was developing at the time...
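
To picture why [+3]00 is accepted while a bare 3 is not, here is a minimal Python sketch. It is an assumption-laden illustration, not the lhasm implementation: it handles only the + operator seen in this thread, treats every unmarked number as hexadecimal, and the function names are invented.

```python
import re


def strict_hex(text):
    """Unmarked value on the normal path: even-length hexadecimal only."""
    if len(text) % 2 != 0 or not re.fullmatch(r"[0-9A-Fa-f]+", text):
        raise ValueError(f"invalid value: {text!r}")
    return int(text, 16)


def evaluate(expr):
    """Evaluate 'operand' or '[+val]operand' (hypothetical simplification)."""
    m = re.fullmatch(r"\[\+([0-9A-Fa-f]+)\](.+)", expr)
    if m is None:
        return strict_hex(expr)
    val, operand = m.groups()
    delta = int(val, 16)          # val is read directly: no even-length check
    return strict_hex(operand) + delta


print(format(evaluate("[+3]00"), "02X"))    # 03
print(format(evaluate("[+A]4455"), "04X"))  # 445F
```

Calling evaluate("3") raises ValueError in this sketch, because only the bracketed val bypasses the strict check.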

I fully agree with your description of backward compatibility, and this is what I expect to face when I start rebuilding lhTools. At this point, I do not know whether I will provide a translation tool or support a "backward" option for the legacy syntax.

May the PC1500 be with you.
kuzja

Re: Experience with lhTools for PC-1500

Post by kuzja »

I have one more observation on the behaviour of the .EQU directive.
As mentioned in an earlier post, in lhasm ver. 0.7.8, this works correctly:

Code:

VAR1	.EQU 4455
VAR2	.EQU VAR1	; evaluates to 4455
VAR3	.EQU [+100]VAR1	; evaluates to 4555

(There must be no prefixes like &, #, @, but that is another issue, not relevant to this post.)
However, after merely changing the name of the symbol, the evaluation fails:

Code:

BAR1	.EQU 4455
BAR2	.EQU BAR1	; evaluates to 00BA
BAR3	.EQU [+100]BAR1	; evaluates to 01BA

Very likely this is related to the fact that the symbol name begins with letters that can be interpreted as hexadecimal digits...
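
One way such a result could arise, purely as a guess and not taken from the lhasm sources, is a lookup that tries an even-length hexadecimal prefix before consulting the symbol table. The Python sketch below uses a hypothetical `resolve` function and a hand-made symbol table to show that this order would reproduce the values above.

```python
import re


def resolve(token, symbols):
    """Hypothetical lookup order: hexadecimal prefix first, symbols second."""
    run = re.match(r"[0-9A-Fa-f]*", token).group(0)
    if run and len(run) % 2 == 0:
        return int(run, 16)       # the rest of the token is silently ignored
    return symbols[token]


symbols = {"VAR1": 0x4455, "BAR1": 0x4455}
for name in ("VAR1", "BAR1"):
    value = resolve(name, symbols)
    print(name, f"{value:04X}", f"{value + 0x100:04X}")
# VAR1 4455 4555
# BAR1 00BA 01BA
```

If this guess is right, any symbol whose name starts with an even number of hexadecimal digits (two or more) would be affected, not just BAR1.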

I understand it is not worth changing this in 0.7.x, but I thought I'd better post this problem so that it can be checked whether it persists in the upcoming version 0.8.0.