novaBBS - comp.lang.c - Re: How to optimize this search?

On 24/08/2022 01:54, antispam@math.uni.wroc.pl wrote:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> bart c <bart4858@gmail.com> writes:
>>> On Tuesday, 23 August 2022 at 20:17:18 UTC+1, Keith Thompson wrote:
>> [...]
>>>> And that's a *lot* of "define"s. It sounds like the code you're
>>>> looking at is not typical. Your numbers don't match what I see
>>>> in the latest sqlite3 sources.
>>>
>>> Yeah, that looked dodgy. I don't know what's going on there. If I
>>> compile only sqlite3.c, then I get more reasonable looking figures:
>>>
>>> 14010 define
>>> 08728 if
>>> 08543 int
>>> 03115 endif
>>> 03111 return
>>> 02658 void
>>> 02291 else
>>>
>>> Note that 10,000 of those defines are inside my windows.h. Also note
>>> that the same 'if' keyword is used for "'#if".
>>>
>>> This is specific to normal C however. We don't know exactly what the
>>> OP is doing, or what his inputs will be, whether they are already
>>> preprocessed or not.
>>
>> Ah, so that's 14010 occurrences of "define" after preprocessing.
>
> Bart wrote that "define" came from "windows.h", so his input is
> before preprocessing.
>
>> The point is that if you're processing C source code and recognizing
>> whether tokens that look like identifiers are keywords or not, it
>> doesn't make sense to have "define" and "int" in the same table.
>
> Why not? Of course logically lookups for preprocessor keywords
> are different from normal lookups, but there is no law that
> prevent storing two logical hash tables in single physical
> hash table. In particular, info about preprocessor keywords
> is likely to be quite small, so there is good chance that
> you can put it in place that would be otherwise wasted by
> padding.
>
> If you want fast compiler it makes sense to do symbol
> table lookup once, that is combine prepocessor with compiler.
> In preprocessing stage you need to look up each identifier to
> check if it is a macro, when it is not a macro you may deliver
> its compilation meaninig. Details needed for correct handling
> of C are likely to be tricky and I do not know if Bart did
> this correctly. But this is clearly doable and likely to give
> large speed advantage.
>
> Note that after macros and normal C symbols are in single
> table what remains is handful of preprocessor keywords.
> Since they are so frequent and exclusive of normal identifiers
> special purpose table may give some speed boost. OTOH
> in combined table preprocessor keywords will be entered
> first, so with high probablity single lookup will work.
> In other words, single combined table is likely to behave
> as perfect hash for the purpose of lookup for proprocessor
> keywords. It looks tricky to come with faster alternative...
>

You /could/ put your pre-processor symbols (preprocessor keywords and
macro identifiers) in the same table as your "main-phase" symbols
(language keywords and identifiers). But you'd need to distinguish the
symbol types in the tables - and you need to look up tokens again after
preprocessing anyway. So I can't see how you could gain anything from
combining them.

Consider this translation unit :

#define foo in
#define bar t
#define foobar in ## t

foobar foo, bar, define;

This has the same result as "int foo, bar, define;". But there is no
string "int" in the input text, and the strings in the input change
meaning radically from before and after pre-processing.

(I apologise for being a bit lose on terminology here, especially
processing phases - I hope I am being clear enough despite that.)

Ben Bacarisse to Anton Shepelev:

> > Thiago wrote nothing about any other strings than the
> > static set of keywords known in advance.
>
> It would be very odd to only lookup strings that you know
> are in a table. And very, very odd given the
> context -- language keywords. TA will want to know if a
> token is or is not a keyword.

Yes, it makes no sense for a lexical analyser, but may be
useful for some queer task. I will try to gnaw out some
time his evening to post my solution.

--
() ascii ribbon campaign - against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

On Wednesday, 24 August 2022 at 01:43:08 UTC+1, Thiago Adams wrote:
> On Tuesday, August 23, 2022 at 8:36:26 PM UTC-3, bart c wrote:
> > On Tuesday, 23 August 2022 at 23:55:43 UTC+1, Thiago Adams wrote:
> >
> > > Jus to document I also tried something like:
> > >
> > > if (ch < 'c')
> > > {
> > > if (ch == 'a') ...
> > > if (ch == 'b') ...
> > > }
> > > else
> > > {
> > > if (ch == 'c') ...
> > > if (ch == 'd') ...
> > > ...
> > > }
> > >
> > > That is similar of binary search .
> > > We can add more and more ifs like this.
> > This is starting to look a little desperate. Such code looks unmaintainable to me (but I assume you will use some tool to generate it).
> >
> > What is your actual aim here? Granted a simple linear search is not going to be that performant. But pretty much anything else is going to be faster. So how much faster does it have to be; have you done any measurements? Is checking for keywords mixed in with any other input such as punctuation or numeric literals or strings or comments; what's the bigger picture?
> One of my objectives is to understand what compilers can do on switch cases
> and to write equivalent code.
>
> This binary search didn't change the performance. My tests are a bunch of files
> that takes 9.4s-10s. And I guess all changes I did are changing less than 0.5s so
> it hard to see.

But how much of that is due to the lookup anyway? A quick and dirty method I used to discover how much a function contributes compared to total runtime (in the absence of proper profiling tools), is to just call it twice, when there are no harmful side-effects.

It might be that your 10 seconds becomes 11 seconds, so that the relevant function takes only 1 seconds. Then an improvement of 0.5 seconds means it's twice the speed.

It also means you look need to look elsewhere to find out why its slow, if 90% of runtime is not due to the lookup function.

(For tokenising source code (not C preprocessing), a modern machine should manage at least 5M lines per second, and parsing, at least 2M lines per second. So if your test files are much less than 20M lines of code, then it could be a bit faster. If more, then there's nothing wrong with that speed.)

>
> This topic is more about exploration and to understand the options.
> Also from a compiler point of view. How switch works. Is it a lot of linear ifs?

How you never used https://godbolt.org? Put any C code in there, and look at the assembly produced. You can play around with compiler versions and options.

My own approach is this:

* Number of cases in switch is 6-8 or less: use a chain of if-else
* Number of cases have values with a span less than 500 or 1000: use a jump-table
* Otherwise use an if-else chain (C compiler); or report an error (non-C compiler, but that has another statement type which is used instead, which always does sequential tests, and also works with non-ints and non-constants)..

If you want something faster than switch, even using a jump-table, then look at computed goto. (Needs gcc extensions in C.) That uses a table of label pointers, and is popular for bytecode-dispatching in interpreters.

On Wednesday, 24 August 2022 at 08:43:41 UTC+1, David Brown wrote:
> On 24/08/2022 01:54, anti...@ wrote:
> > If you want fast compiler it makes sense to do symbol
> > table lookup once, that is combine prepocessor with compiler.
> > In preprocessing stage you need to look up each identifier to
> > check if it is a macro, when it is not a macro you may deliver
> > its compilation meaninig. Details needed for correct handling
> > of C are likely to be tricky and I do not know if Bart did
> > this correctly. But this is clearly doable and likely to give
> > large speed advantage.
> >
> > Note that after macros and normal C symbols are in single
> > table what remains is handful of preprocessor keywords.
> > Since they are so frequent and exclusive of normal identifiers
> > special purpose table may give some speed boost. OTOH
> > in combined table preprocessor keywords will be entered
> > first, so with high probablity single lookup will work.
> > In other words, single combined table is likely to behave
> > as perfect hash for the purpose of lookup for proprocessor
> > keywords. It looks tricky to come with faster alternative...
> >
> You /could/ put your pre-processor symbols (preprocessor keywords and
> macro identifiers) in the same table as your "main-phase" symbols
> (language keywords and identifiers). But you'd need to distinguish the
> symbol types in the tables - and you need to look up tokens again after
> preprocessing anyway. So I can't see how you could gain anything from
> combining them.
>
> Consider this translation unit :
>
>
> #define foo in
> #define bar t
> #define foobar in ## t
>
> foobar foo, bar, define;
>
>
> This has the same result as "int foo, bar, define;". But there is no
> string "int" in the input text, and the strings in the input change
> meaning radically from before and after pre-processing.

Your example includes these low-level tokens: 'define foo in bar t foobar'. The token-pasting synthesises a new token 'int' which also needs looking up, although that particular token is an existing keywords so already exists.

If I put your example into my compiler, it produces this main structured symbol table (keeping my fingers crossed that googlegroups shows it as fixed pitch, unlike what I'm looking at right now).

Global Symbol Table
:t---------------------------moduleid....[- M1 ]===($prog) none
----:in----------------------staticid....[Exp ]====(t) int static
----:t-----------------------staticid....[Exp ]====(t) int static
----:define------------------staticid....[Exp ]====(t) int static

This is actually superimposed (via various links between entries) onto one global
flat symbol table, like the following. This normally displays only user-defined names,
but I've got it to show built-in type names and CPP directives as well.

Indented entries are duplicate instances of any name, added via a linked list to the
generic version:

Global Flat Symbol Table:
689 0320D8C0 : int ktypespecsym nullid
979 032169C0 : foobar namesym macroid
3596 03268640 : t namesym nullid
---- 03B3E720 t namesym staticid 03268640(From t)
---- 03B3E440 t namesym moduleid 03B3E720(From $prog)
3908 03272240 : long ktypespecsym nullid
6089 032B64C0 : MESSAGE ksourcedirsym nullid
8420 032FF240 : pragma ksourcedirsym nullid
11293 03358EC0 : bar namesym macroid
13706 033A4540 : pause ksourcedirsym nullid
16809 03511540 : message ksourcedirsym nullid
17215 0351E040 : double ktypespecsym nullid
27532 036606C0 : line ksourcedirsym nullid
31476 036DBAC0 : signed ktypespecsym nullid
34091 037426C0 : unsigned ktypespecsym nullid
34684 03754F40 : elif ksourcedirsym nullid
35604 03771B40 : float ktypespecsym nullid
38404 037C9340 : short ktypespecsym nullid
38856 037D7540 : error ksourcedirsym nullid
39813 037F53C0 : define ksourcedirsym nullid
---- 03B3E7A0 define namesym staticid 037F53C0(From t)
41844 03834B40 : warning ksourcedirsym nullid
42984 03858540 : ifdef ksourcedirsym nullid
43135 0385D0C0 : _Complex ktypespecsym nullid
43832 03872D40 : endif ksourcedirsym nullid
45610 038AA640 : foo namesym macroid
48332 038FF740 : ifndef ksourcedirsym nullid
50296 03950DC0 : include ksourcedirsym nullid
51684 0397C3C0 : undef ksourcedirsym nullid
52235 0398D740 : in namesym nullid
---- 03B3E690 in namesym staticid 0398D740(From t)
53041 039A6A40 : showmacro ksourcedirsym nullid
56050 03A04AC0 : $prog namesym nullid
---- 03B3E380 $prog namesym programid 03A04AC0(From -)
56330 03A0D6C0 : char ktypespecsym nullid
56414 03A100C0 : debugon ksourcedirsym nullid
59220 03A67BC0 : debugoff ksourcedirsym nullid
62951 03ADC540 : _Bool ktypespecsym nullid
64700 03B12FC0 : void ktypespecsym nullid

It shows that there are two versions of 'define', but only one version of 'int'.
('t' is also the name of this module, t.c).

Built-in keywords exist at the 'root' entry of each unique symbol name.

User-defined names (like 3 different definitions of 'abc' across a module), start with a generic 'abc' plus 3 specific ones linked to that. Sometimes the generic name is also an allowed keyword like 'define'.

User-define macro names however are stored as the root symbol.

If someone does '#define int abc', then it hides the type 'int' with the macro 'int'.

On Wednesday, August 24, 2022 at 7:34:18 AM UTC-3, bart c wrote:
> On Wednesday, 24 August 2022 at 01:43:08 UTC+1, Thiago Adams wrote:
> > On Tuesday, August 23, 2022 at 8:36:26 PM UTC-3, bart c wrote:
> > > On Tuesday, 23 August 2022 at 23:55:43 UTC+1, Thiago Adams wrote:
> > >
> > > > Jus to document I also tried something like:
> > > >
> > > > if (ch < 'c')
> > > > {
> > > > if (ch == 'a') ...
> > > > if (ch == 'b') ...
> > > > }
> > > > else
> > > > {
> > > > if (ch == 'c') ...
> > > > if (ch == 'd') ...
> > > > ...
> > > > }
> > > >
> > > > That is similar of binary search .
> > > > We can add more and more ifs like this.
> > > This is starting to look a little desperate. Such code looks unmaintainable to me (but I assume you will use some tool to generate it).
> > >
> > > What is your actual aim here? Granted a simple linear search is not going to be that performant. But pretty much anything else is going to be faster. So how much faster does it have to be; have you done any measurements? Is checking for keywords mixed in with any other input such as punctuation or numeric literals or strings or comments; what's the bigger picture?
> > One of my objectives is to understand what compilers can do on switch cases
> > and to write equivalent code.
> >
> > This binary search didn't change the performance. My tests are a bunch of files
> > that takes 9.4s-10s. And I guess all changes I did are changing less than 0.5s so
> > it hard to see.
> But how much of that is due to the lookup anyway? A quick and dirty method I used to discover how much a function contributes compared to total runtime (in the absence of proper profiling tools), is to just call it twice, when there are no harmful side-effects.
>
> It might be that your 10 seconds becomes 11 seconds, so that the relevant function takes only 1 seconds. Then an improvement of 0.5 seconds means it's twice the speed.

It is an interesting method.
Calling it twice I got 9.73s compared with 9.28s (~.4s) . The results changes a lot +- 0.5s.
A better function will improve just about 0.1s-0.2s I guess. So, again, this is more about exploring
the problem and it will not improve the performance significantly.

Thiago Adams <thiago.adams@gmail.com> writes:

> Given a c string I need to return as fast as possible the pair (if exist) of the corresponding keyword_pair.
>
>
> struct keyword_pair{
> const char* lexeme;
> int token;
> };
>
> struct keyword_pair[] = {
> { "NULL", 0},
> { "_Alignas", 1},
> { "_Atomic", 2},
> { "_BitInt", 3},
> { "_Bool", 4},
> { "_Complex", 5},
> { "_Decimal128", 6},
> { "_Decimal32", 7},
> { "_Decimal64", 8},
> { "_Generic", 9},
> { "_Hashof", 10},
> { "_Imaginary", 11},
> { "_Noreturn", 12},
> { "_Static_assert", 13},
> { "_Thread_local", 14},
> { "__alignof", 15},
> { "__asm", 16},
> { "__forceinline", 17},
> { "__inline", 18},
> { "__int16", 19},
> { "__int32", 20},
> { "__int64", 21},
> { "__int8", 22},
> { "_asm", 23},
> { "alignas", 24},
> { "alignof", 25},
> { "auto", 26},
> { "bool", 27},
> { "break", 28},
> { "case", 29},
> { "catch", 30},
> { "char", 31},
> { "const", 32},
> { "constexpr", 33},
> { "continue", 34},
> { "default", 35},
> { "defer", 36},
> { "do", 37},
> { "double", 38},
> { "else", 39},
> { "enum", 40},
> { "extern", 41},
> { "false", 42},
> { "float", 43},
> { "for", 44},
> { "goto", 45},
> { "if", 46},
> { "inline", 47},
> { "int", 48},
> { "long", 49},
> { "nullptr", 50},
> { "register", 51},
> { "repeat", 52},
> { "restrict", 53},
> { "return", 54},
> { "short", 55},
> { "signed", 56},
> { "sizeof", 57},
> { "static", 58},
> { "static_assert", 59},
> { "struct", 60},
> { "switch", 61},
> { "thread_local", 62},
> { "throw", 63},
> { "true", 64},
> { "try", 65},
> { "typedef", 66},
> { "typeid", 67},
> { "typeof", 68},
> { "typeof_unqual", 69},
> { "union", 70},
> { "unsigned", 71},
> { "void", 72},
> { "volatile", 73},
> { "while", 74},
> };

I wanted to time some algorithms so in case we want some repeatable and
comparable timings, here is the driver program I am using:

#include <stdio.h>
#include <ctype.h>

int lookup(const char *tok, unsigned len);

int main(void)
{ int r = 0, c;
while ((c = getchar()) != EOF)
if (isalpha(c) || c == '_') {
unsigned l = 0;
char token[20];
do token[l++] = c;
while (l < sizeof token - 1 &&
(isalnum(c = getchar()) || c == '_'));
token[l] = 0;
r += lookup(token, l);
}
printf("%d\n", r);
}

The lookup function returns the token value or -1 for non-keywords.

To get an IO and driver overhead, I compiles and linked with

int lookup(const char *tok, unsigned l)
{ return 1;
}

Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
this "null" version reports 992905 in about 0.072s. (The result,
992905, is of course wrong in this case.)

A version using the perfect hash generated by gperf returns 3231512 in
about 0.075s and is hard to distinguish from the IO-only run time.

A linear search returns 3231512 in 0.24s, and a binary search using C's
often overlooked bsearch function returns the same result in 0.1s.

All programs compiled and linked with gcc -O3. The lookup function was
always in a separate translation unit and is not being inlined.

--
Ben.

On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:
> Thiago Adams <thiago...@gmail.com> writes:
>
> > Given a c string I need to return as fast as possible the pair (if exist) of the corresponding keyword_pair.
> >
> >
> > struct keyword_pair{
> > const char* lexeme;
> > int token;
> > };
> >
> > struct keyword_pair[] = {
> > { "NULL", 0},
> > { "_Alignas", 1},
> > { "_Atomic", 2},
> > { "_BitInt", 3},
> > { "_Bool", 4},
> > { "_Complex", 5},
> > { "_Decimal128", 6},
> > { "_Decimal32", 7},
> > { "_Decimal64", 8},
> > { "_Generic", 9},
> > { "_Hashof", 10},
> > { "_Imaginary", 11},
> > { "_Noreturn", 12},
> > { "_Static_assert", 13},
> > { "_Thread_local", 14},
> > { "__alignof", 15},
> > { "__asm", 16},
> > { "__forceinline", 17},
> > { "__inline", 18},
> > { "__int16", 19},
> > { "__int32", 20},
> > { "__int64", 21},
> > { "__int8", 22},
> > { "_asm", 23},
> > { "alignas", 24},
> > { "alignof", 25},
> > { "auto", 26},
> > { "bool", 27},
> > { "break", 28},
> > { "case", 29},
> > { "catch", 30},
> > { "char", 31},
> > { "const", 32},
> > { "constexpr", 33},
> > { "continue", 34},
> > { "default", 35},
> > { "defer", 36},
> > { "do", 37},
> > { "double", 38},
> > { "else", 39},
> > { "enum", 40},
> > { "extern", 41},
> > { "false", 42},
> > { "float", 43},
> > { "for", 44},
> > { "goto", 45},
> > { "if", 46},
> > { "inline", 47},
> > { "int", 48},
> > { "long", 49},
> > { "nullptr", 50},
> > { "register", 51},
> > { "repeat", 52},
> > { "restrict", 53},
> > { "return", 54},
> > { "short", 55},
> > { "signed", 56},
> > { "sizeof", 57},
> > { "static", 58},
> > { "static_assert", 59},
> > { "struct", 60},
> > { "switch", 61},
> > { "thread_local", 62},
> > { "throw", 63},
> > { "true", 64},
> > { "try", 65},
> > { "typedef", 66},
> > { "typeid", 67},
> > { "typeof", 68},
> > { "typeof_unqual", 69},
> > { "union", 70},
> > { "unsigned", 71},
> > { "void", 72},
> > { "volatile", 73},
> > { "while", 74},
> > };
> I wanted to time some algorithms so in case we want some repeatable and
> comparable timings, here is the driver program I am using:
>
> #include <stdio.h>
> #include <ctype.h>
>
> int lookup(const char *tok, unsigned len);
>
> int main(void)
> {
> int r = 0, c;
> while ((c = getchar()) != EOF)
> if (isalpha(c) || c == '_') {
> unsigned l = 0;
> char token[20];
> do token[l++] = c;
> while (l < sizeof token - 1 &&
> (isalnum(c = getchar()) || c == '_'));
> token[l] = 0;
> r += lookup(token, l);
> }
> printf("%d\n", r);
> }
>
> The lookup function returns the token value or -1 for non-keywords.
>
> To get an IO and driver overhead, I compiles and linked with
>
> int lookup(const char *tok, unsigned l)
> {
> return 1;
> }
>
> Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
> this "null" version reports 992905 in about 0.072s. (The result,
> 992905, is of course wrong in this case.)
>
> A version using the perfect hash generated by gperf returns 3231512 in
> about 0.075s and is hard to distinguish from the IO-only run time.
>
> A linear search returns 3231512 in 0.24s, and a binary search using C's
> often overlooked bsearch function returns the same result in 0.1s.
>
> All programs compiled and linked with gcc -O3. The lookup function was
> always in a separate translation unit and is not being inlined.
>
> --

Can you plug this one and compare? I am not using the length.
https://godbolt.org/z/1Yh96ch5e
this one is like:

if (*s == 'N') { ... }
if (*s == 'a') { ... }
if (*s == 'b') { ... }
if (*s == 'c') { ... }
if (*s == 'd') { ... }
if (*s == 'e') { ... }
if (*s == 'f') { ... }
if (*s == 'g') { ... }
if (*s == 'i') { ... }
if (*s == 'l') { ... }
if (*s == 'n') { ... }
if (*s == 'r') { ... }
if (*s == 's') { ... }
if (*s == 't') { ... }
if (*s == 'u') { ... }
if (*s == 'v') { ... }
if (*s == 'w') { ... }
if (*s == '_') { ... }

On Wednesday, 24 August 2022 at 13:28:50 UTC+1, Ben Bacarisse wrote:

> I wanted to time some algorithms so in case we want some repeatable and
> comparable timings, here is the driver program I am using:
>
> #include <stdio.h>
> #include <ctype.h>
>
> int lookup(const char *tok, unsigned len);
>
> int main(void)
> {
> int r = 0, c;
> while ((c = getchar()) != EOF)
> if (isalpha(c) || c == '_') {
> unsigned l = 0;
> char token[20];
> do token[l++] = c;
> while (l < sizeof token - 1 &&
> (isalnum(c = getchar()) || c == '_'));
> token[l] = 0;
> r += lookup(token, l);
> }
> printf("%d\n", r);
> }
>
> The lookup function returns the token value or -1 for non-keywords.
>
> To get an IO and driver overhead, I compiles and linked with
>
> int lookup(const char *tok, unsigned l)
> {
> return 1;
> }
>
> Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
> this "null" version reports 992905 in about 0.072s. (The result,
> 992905, is of course wrong in this case.)
>
> A version using the perfect hash generated by gperf returns 3231512 in
> about 0.075s and is hard to distinguish from the IO-only run time.
>
> A linear search returns 3231512 in 0.24s, and a binary search using C's
> often overlooked bsearch function returns the same result in 0.1s.
>
> All programs compiled and linked with gcc -O3. The lookup function was
> always in a separate translation unit and is not being inlined.

I tried your program, using Windows 10. I got a base-timing of about 0.2 seconds no matter what compiler was used.

Using a simpler linear search based on strcmp() (as I used the OP's table without lengths), the timing was about 0.5 seconds with gcc-O3; (my own compiler was 10% slower, so even with lookup() in the same file, optimisation is of little help).

OK, I decided to add precomputed lengths. Gcc-O3 now does a better job at 0..3 seconds, with mine at 0.45 seconds.

Note that my C compiler, itself unoptimised, can compile this module to .asm/.obj in 0.28/0/35 seconds. I can't test tcc on it but it is likely faster than that. Linear search still isn't great.

Although it should be clear what this is measuring, which is detecting alphanumeric tokens in the input, even inside strings and comments (sqlite3.c is about 40% extensive comments). I believe the actual number of keywords inside that file is about 60,000, so this test is doing a lot more work than necessary for compilation.

On Wednesday, August 24, 2022 at 10:26:07 AM UTC-3, Thiago Adams wrote:
> On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:
....
> Can you plug this one and compare? I am not using the length.
> https://godbolt.org/z/1Yh96ch5e
> this one is like:
>
> if (*s == 'N') { ... }
> if (*s == 'a') { ... }
> if (*s == 'b') { ... }
> if (*s == 'c') { ... }
> if (*s == 'd') { ... }
> if (*s == 'e') { ... }
> if (*s == 'f') { ... }
> if (*s == 'g') { ... }
....
Looking at the code, -O2, gcc it creates a jump table.

lookup:
movzx edx, BYTE PTR [rdi]
cmp dl, 78
je .L450
sub edx, 95
cmp dl, 24
ja .L81
movzx edx, dl
jmp [QWORD PTR .L5[0+rdx*8]] <<<<<<<<<<
..L5:
.quad .L21
.quad .L81
....

So doesn't matter the order of "ifs" or changing "ifs" by "switch case" the
result is the same. A

David Brown <david.brown@hesbynett.no> wrote:
> On 24/08/2022 01:54, antispam@math.uni.wroc.pl wrote:
> > Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
> >> bart c <bart4858@gmail.com> writes:
> >>> On Tuesday, 23 August 2022 at 20:17:18 UTC+1, Keith Thompson wrote:
> >> [...]
> >>>> And that's a *lot* of "define"s. It sounds like the code you're
> >>>> looking at is not typical. Your numbers don't match what I see
> >>>> in the latest sqlite3 sources.
> >>>
> >>> Yeah, that looked dodgy. I don't know what's going on there. If I
> >>> compile only sqlite3.c, then I get more reasonable looking figures:
> >>>
> >>> 14010 define
> >>> 08728 if
> >>> 08543 int
> >>> 03115 endif
> >>> 03111 return
> >>> 02658 void
> >>> 02291 else
> >>>
> >>> Note that 10,000 of those defines are inside my windows.h. Also note
> >>> that the same 'if' keyword is used for "'#if".
> >>>
> >>> This is specific to normal C however. We don't know exactly what the
> >>> OP is doing, or what his inputs will be, whether they are already
> >>> preprocessed or not.
> >>
> >> Ah, so that's 14010 occurrences of "define" after preprocessing.
> >
> > Bart wrote that "define" came from "windows.h", so his input is
> > before preprocessing.
> >
> >> The point is that if you're processing C source code and recognizing
> >> whether tokens that look like identifiers are keywords or not, it
> >> doesn't make sense to have "define" and "int" in the same table.
> >
> > Why not? Of course logically lookups for preprocessor keywords
> > are different from normal lookups, but there is no law that
> > prevent storing two logical hash tables in single physical
> > hash table. In particular, info about preprocessor keywords
> > is likely to be quite small, so there is good chance that
> > you can put it in place that would be otherwise wasted by
> > padding.
> >
> > If you want fast compiler it makes sense to do symbol
> > table lookup once, that is combine prepocessor with compiler.
> > In preprocessing stage you need to look up each identifier to
> > check if it is a macro, when it is not a macro you may deliver
> > its compilation meaninig. Details needed for correct handling
> > of C are likely to be tricky and I do not know if Bart did
> > this correctly. But this is clearly doable and likely to give
> > large speed advantage.
> >
> > Note that after macros and normal C symbols are in single
> > table what remains is handful of preprocessor keywords.
> > Since they are so frequent and exclusive of normal identifiers
> > special purpose table may give some speed boost. OTOH
> > in combined table preprocessor keywords will be entered
> > first, so with high probablity single lookup will work.
> > In other words, single combined table is likely to behave
> > as perfect hash for the purpose of lookup for proprocessor
> > keywords. It looks tricky to come with faster alternative...
> >
>
> You /could/ put your pre-processor symbols (preprocessor keywords and
> macro identifiers) in the same table as your "main-phase" symbols
> (language keywords and identifiers). But you'd need to distinguish the
> symbol types in the tables - and you need to look up tokens again after
> preprocessing anyway. So I can't see how you could gain anything from
> combining them.
>
> Consider this translation unit :
>
>
> #define foo in
> #define bar t
> #define foobar in ## t
>
> foobar foo, bar, define;
>
>
> This has the same result as "int foo, bar, define;". But there is no
> string "int" in the input text, and the strings in the input change
> meaning radically from before and after pre-processing.

This is not tricky at all. When scanner sees 'foobar' it will do
symbol table lookup. Symbol table contains info that it is a macro
so scanner looks at replacement list. Replacement list contains
symbol table entries for in and t (so no need for lookup). It
also contains marker for pasting. So scanner takes names, creates
pasted name (that is 'int') and performs lookup on this name
getting symbol table entry for 'int' with its predefiend meaning.
Next scanner looks at 'foo' gets it symbol table entry which
again is a macro, from replacement list gets symbol table entry
for 'in' which is not a macro so it get used. Similarly
for 'bar'. Concerning 'define', in first 3 lines scanner
sees '#' so only takes 'preprocessor keyword' part from
symbol table. Final 'define' is not after appropriate '#'
so scanner only looks at macro and C identifier meanings.

Note that pasted tookens are looked up once per expansion,
this is needed for correctness (one could try to optimize,
but it is not clear if gains would justify added complexity).

There are some inconvenient aspects. Normally macro shadow
other meanings, but there are ways to access both (like
'(f)' to get meaning of 'f' as a function), so in general
symbol table must store both macro meaning and other
meaning (if both are present). Structs and unions
require care to get fields when appropriate and outside
meaning otherwise.

--
Waldek Hebisch

Thiago Adams <thiago.adams@gmail.com> writes:

> On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:

>> Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
>> this "null" version reports 992905 in about 0.072s. (The result,
>> 992905, is of course wrong in this case.)
>>
>> A version using the perfect hash generated by gperf returns 3231512 in
>> about 0.075s and is hard to distinguish from the IO-only run time.
>>
>> A linear search returns 3231512 in 0.24s, and a binary search using C's
>> often overlooked bsearch function returns the same result in 0.1s.
>>
>> All programs compiled and linked with gcc -O3. The lookup function was
>> always in a separate translation unit and is not being inlined.
>
> Can you plug this one and compare? I am not using the length.
> https://godbolt.org/z/1Yh96ch5e

Hmm... it produces a different count. Are the numbers supposed to those
you originally posted? I may be using an old table.

It runs a little slower than the perfect hash: about 0.08s. Both are
close to the IO-only times so the lookup itself it may be significantly
slower, but much more detailed timings would be needed to find out.

I hope the code was automatically generated. It's looks fragile and
very hard to verify!

--
Ben.

On Wednesday, 24 August 2022 at 14:26:07 UTC+1, Thiago Adams wrote:
> On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:

> > I wanted to time some algorithms so in case we want some repeatable and
> > comparable timings, here is the driver program I am using:
> >
> > #include <stdio.h>
> > #include <ctype.h>
> >
> > int lookup(const char *tok, unsigned len);
> >
> > int main(void)
> > {
> > int r = 0, c;
> > while ((c = getchar()) != EOF)
> > if (isalpha(c) || c == '_') {
> > unsigned l = 0;
> > char token[20];
> > do token[l++] = c;
> > while (l < sizeof token - 1 &&
> > (isalnum(c = getchar()) || c == '_'));
> > token[l] = 0;
> > r += lookup(token, l);
> > }
> > printf("%d\n", r);
> > }
> >
> > The lookup function returns the token value or -1 for non-keywords.
> >
> > To get an IO and driver overhead, I compiles and linked with
> >
> > int lookup(const char *tok, unsigned l)
> > {
> > return 1;
> > }
> >
> > Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
> > this "null" version reports 992905 in about 0.072s. (The result,
> > 992905, is of course wrong in this case.)
> >
> > A version using the perfect hash generated by gperf returns 3231512 in
> > about 0.075s and is hard to distinguish from the IO-only run time.
> >
> > A linear search returns 3231512 in 0.24s, and a binary search using C's
> > often overlooked bsearch function returns the same result in 0.1s.
> >
> > All programs compiled and linked with gcc -O3. The lookup function was
> > always in a separate translation unit and is not being inlined.
> >
> > --
> Can you plug this one and compare? I am not using the length.
> https://godbolt.org/z/1Yh96ch5e
> this one is like:

I tried your code in Ben's program and it's pretty fast.

However, I don't believe it's a good fit within that test program, as that does quite a lot of work in finding the boundaries of each token, using system calls, and extracting the token into a zero-terminated string.

Your lookup method doesn't need all that, it can start work immediately the first alphabetic is detected. But it needs to consume characters from the input ready for the next token. (Personally I would grab the entire input file into an in-memory string, or use another method to map the file so that it can be accessed like a string.)

So it could end up performing better with a customised test program.

A hash-table approach has the same problems in needing to identify the whole token first, /and/ work out its hash value. However it has the potential to be more useful within the bigger program, whatever that happens to do.

On Wednesday, August 24, 2022 at 11:19:34 AM UTC-3, Ben Bacarisse wrote:
> Thiago Adams <thiago...@gmail.com> writes:
>
> > On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:
>
> >> Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
> >> this "null" version reports 992905 in about 0.072s. (The result,
> >> 992905, is of course wrong in this case.)
> >>
> >> A version using the perfect hash generated by gperf returns 3231512 in
> >> about 0.075s and is hard to distinguish from the IO-only run time.
> >>
> >> A linear search returns 3231512 in 0.24s, and a binary search using C's
> >> often overlooked bsearch function returns the same result in 0.1s.
> >>
> >> All programs compiled and linked with gcc -O3. The lookup function was
> >> always in a separate translation unit and is not being inlined.
> >
> > Can you plug this one and compare? I am not using the length.
> > https://godbolt.org/z/1Yh96ch5e
> Hmm... it produces a different count. Are the numbers supposed to those
> you originally posted? I may be using an old table.

The original was not sorted by "strcmp" and the code I posted is.

sorted by strcmp is this (token is the index, NULL is 0)

"NULL",
"_Alignas",
"_Alignof",
"_Atomic",
"_BitInt",
"_Bool",
"_Complex",
"_Decimal128",
"_Decimal32",
"_Decimal64",
"_Generic",
"_Hashof",
"_Imaginary",
"_Noreturn",
"_Static_assert",
"_Thread_local",
"__alignof",
"__asm",
"__forceinline",
"__inline",
"__int16",
"__int32",
"__int64",
"__int8",
"_asm",
"alignas",
"alignof",
"auto",
"bool",
"break",
"case",
"catch",
"char",
"const",
"constexpr",
"continue",
"default",
"defer",
"do",
"double",
"else",
"enum",
"extern",
"false",
"float",
"for",
"goto",
"if",
"inline",
"int",
"long",
"nullptr",
"register",
"repeat",
"restrict",
"return",
"short",
"signed",
"sizeof",
"static",
"static_assert",
"struct",
"switch",
"thread_local",
"throw",
"true",
"try",
"typedef",
"typeid",
"typeof",
"typeof_unqual",
"union",
"unsigned",
"void",
"volatile",
"while",

On Wednesday, August 24, 2022 at 11:19:34 AM UTC-3, Ben Bacarisse wrote:
> Thiago Adams <thiago...@gmail.com> writes:
....
> I hope the code was automatically generated. It's looks fragile and
> very hard to verify!

Generator:

#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>

void GenerateCore(const char* keywords[], int first, int last, int level, int* count)
{ int ident = (level + 1) * 2;
for (int i = first; i <= last; i++) {
int begin = i;
int end = begin;
for (int k = i + 1; k <= last; k++) {
if (keywords[k][level] == keywords[begin][level]) {
i++;
end++;
}
else
break;
}
//we have the range
if (begin == end) {
//just one
if (keywords[i][level] != '\0') {
printf("%*cif(*s == '%c') {\n", ident * 2, ' ', keywords[i][level]);
printf("%*cs++;\n", ident * 3, ' ');
printf("%*c/*%s*/\n", ident * 3, ' ', keywords[i]);
printf("%*cif(", ident * 3, ' ');
int len = (int)strlen(keywords[i]);

int j = level + 1;
for (; j < len; j++) {
if (j != level + 1)
printf(" &&");

printf(" *s++ == '%c'", keywords[i][j]);
}
if (j != level + 1) {
printf(" &&");
}
printf(" *s =='\\0') return %i; else return -1;", i);

printf(";\n");
printf("%*c}\n", ident * 2, ' ');
}
else {
printf("%*cif(*s == '\\0') {\n", ident * 2, ' ');
printf("%*cs++;\n", ident * 3, ' ');
printf("%*c/*%s*/\n", ident * 3, ' ', keywords[i]);
printf("%*creturn % i;\n", ident * 3, ' ', i);
printf("%*c};\n", ident * 2, ' ');
}

(*count)++;
}
else {
printf("%*cif(*s == '%c') {\n", ident * 2, ' ', keywords[i][level]);
printf("%*cs++; \n", ident * 3, ' ');
GenerateCore(keywords, begin, end, level + 1, count);
printf("%*creturn -1;\n", ident * 3, ' ');
printf("%*c};\n", ident * 2, ' ');
}
}
}

void Generate(const char* keywords[], int size) {
/*sort keywords*/
int i, j;
for (i = 0; i < size - 1; i++) {
for (j = 0; j < size - i - 1; j++) {
if (strcmp(keywords[j], keywords[j + 1]) > 0) {
char* temp = keywords[j + 1];
keywords[j + 1] = keywords[j];
keywords[j] = temp;
//swap(&arr[j], &arr[j + 1]);
}
}
}

printf("int find(const char* s)\n");
printf("{\n");
int count = 0;
GenerateCore(keywords, 0, size - 1, 0, &count);
printf("%*creturn -1;\n", 2, ' ');
printf("}\n");
}

int main()
{ const char* keywords[] = {
"NULL", "_Alignas", "_Alignof",
"__alignof", "_Atomic", "_BitInt",
"_Bool", "_Complex", "_Decimal128",
"_Decimal32", "_Decimal64", "_Generic",
"_Hashof", "_Imaginary", "_Noreturn",
"_Static_assert", "_Thread_local", "__asm",
"__forceinline", "__inline", "__int16",
"__int32", "__int64", "__int8",
"_asm", "alignas", "alignof", "auto",
"bool", "break", "case", "catch",
"char", "const", "constexpr", "continue",
"default", "defer", "do", "double",
"else", "enum", "extern", "false",
"float", "for", "goto", "if",
"inline", "int", "long", "nullptr",
"register", "repeat", "restrict", "return",
"short", "signed", "sizeof", "static",
"static_assert", "struct", "switch", "thread_local",
"throw", "true", "try", "typedef", "typeid",
"typeof", "typeof_unqual", "union", "unsigned", "void",
"volatile", "while"
};

Generate(keywords, sizeof(keywords) / sizeof(keywords[0]));
}

Thiago Adams <thiago.adams@gmail.com> writes:
>After preprocessor , at first usage, the tokens of type
>"identifier" are checked and promoted to keywords (if there
>are keywords ).

Actually these preprocessing tokens never are converted to
an "identifier", rather, a preprocessing token that could be
converted to either a keyword or an identifier, is converted
to a keyword in translation phase 7.

I find it hard to say whether "if" is an identifier.
It /is/ according to the EBNF syntax "identifier-nondigit |
identifier identifier-nondigit | identifier digit", but
every C implementation is obliged to interpret it as
a keyword and not as an identifier when encountered as
a preprocessing token.

(Strictly from the syntax, we could also say that a program
" ... int main() ..." contains the identifier "ai", because
"ai" is in the program and has the syntax "identifier digit".
It just never becomes a preprocessing token!)

Thiago Adams <thiago.adams@gmail.com> writes:

> On Wednesday, August 24, 2022 at 11:19:34 AM UTC-3, Ben Bacarisse wrote:
>> Thiago Adams <thiago...@gmail.com> writes:
>>
>> > On Wednesday, August 24, 2022 at 9:28:50 AM UTC-3, Ben Bacarisse wrote:
>>
>> >> Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
>> >> this "null" version reports 992905 in about 0.072s. (The result,
>> >> 992905, is of course wrong in this case.)
>> >>
>> >> A version using the perfect hash generated by gperf returns 3231512 in
>> >> about 0.075s and is hard to distinguish from the IO-only run time.
>> >>
>> >> A linear search returns 3231512 in 0.24s, and a binary search using C's
>> >> often overlooked bsearch function returns the same result in 0.1s.
>> >>
>> >> All programs compiled and linked with gcc -O3. The lookup function was
>> >> always in a separate translation unit and is not being inlined.
>> >
>> > Can you plug this one and compare? I am not using the length.
>> > https://godbolt.org/z/1Yh96ch5e
>> Hmm... it produces a different count. Are the numbers supposed to those
>> you originally posted? I may be using an old table.
>
> The original was not sorted by "strcmp" and the code I posted is.

Not only have you changed the order (and hence the numbering) you've
added a keyword. That just messes everything up. I'll just assume your
code give the right answer, because it will be a pain to check it.

--
Ben.

Thiago Adams <thiago.adams@gmail.com> writes:

> On Wednesday, August 24, 2022 at 11:19:34 AM UTC-3, Ben Bacarisse wrote:
>> Thiago Adams <thiago...@gmail.com> writes:
> ...
>> I hope the code was automatically generated. It's looks fragile and
>> very hard to verify!
>
> Generator:

That's good. But the generator is not easy to check! I'd go with a
relatively widely-used tool. Especially as it give faster code.

--
Ben.

David Brown <david.brown@hesbynett.no> wrote:
> On 24/08/2022 00:55, Thiago Adams wrote:
>
>>
>> Jus to document I also tried something like:
>>
>> if (ch < 'c')
>> {
>> if (ch == 'a') ...
>> if (ch == 'b') ...
>> }
>> else
>> {
>> if (ch == 'c') ...
>> if (ch == 'd') ...
>> ...
>> }
>>
>> That is similar of binary search .
>> We can add more and more ifs like this.
>>
>
> gcc (and clang, I think) will sometimes generate that kind of structure
> for switch statements, if the cases are too sparse for a jump table to
> be appropriate.

Things may have changed in the past few years, but for a long time gcc and
clang generated jump tables far less often than commonly assumed. Jump
tables may be a classic compiler optimization technique, but on modern CPUs
(with sophisticated branch prediction, etc) it's difficult to guess when a
jump table would be faster than a few conditionals, and gcc and clang seemed
to perhaps overly pessimize them, which is why it was and remains common to
explicitly rely on computed gotos, including computed gotos indexed through
an array, the method recommended in the GCC manual to avoid linking overhead
but which is usually slower than taking and dereferencing the raw label
addresses.

It's not just about sparseness and algorithmic ease of deriving a branch
location, but about how *often* specific cases are taken at runtime. If
certain cases are especially hot, a jump table might not only unnecessarily
incur extra loads, but impede the branch predictor and other machinery. In
microbenchmarks this might not be apparent, but in larger applications where
buffers and pipelines are under pressure, it often does matter.

OTOH, if you know there's a more even distribution of cases taken, such as
with a bytecode interpreter loop, then the tradeoff of a jump table load or
indirect branch on a raw computed goto makes sense--it just wouldn't be
readily apparent to the compiler sans profile-directed feedback.

On Wednesday, 24 August 2022 at 14:45:51 UTC+1, bart c wrote:
> On Wednesday, 24 August 2022 at 13:28:50 UTC+1, Ben Bacarisse wrote:
>
> > I wanted to time some algorithms so in case we want some repeatable and
> > comparable timings, here is the driver program I am using:
> >
> > #include <stdio.h>
> > #include <ctype.h>
> >
> > int lookup(const char *tok, unsigned len);
> >
> > int main(void)
> > {
> > int r = 0, c;
> > while ((c = getchar()) != EOF)
> > if (isalpha(c) || c == '_') {
> > unsigned l = 0;
> > char token[20];
> > do token[l++] = c;
> > while (l < sizeof token - 1 &&
> > (isalnum(c = getchar()) || c == '_'));
> > token[l] = 0;
> > r += lookup(token, l);
> > }
> > printf("%d\n", r);
> > }
> >
> > The lookup function returns the token value or -1 for non-keywords.
> >
> > To get an IO and driver overhead, I compiles and linked with
> >
> > int lookup(const char *tok, unsigned l)
> > {
> > return 1;
> > }
> >
> > Running on sqlite3's amalgamated source (sqlite-amalgamation-3390200)
> > this "null" version reports 992905 in about 0.072s. (The result,
> > 992905, is of course wrong in this case.)
> >
> > A version using the perfect hash generated by gperf returns 3231512 in
> > about 0.075s and is hard to distinguish from the IO-only run time.
> >
> > A linear search returns 3231512 in 0.24s, and a binary search using C's
> > often overlooked bsearch function returns the same result in 0.1s.
> >
> > All programs compiled and linked with gcc -O3. The lookup function was
> > always in a separate translation unit and is not being inlined.
> I tried your program, using Windows 10. I got a base-timing of about 0.2 seconds no matter what compiler was used.

This is apparently due to the poor C library that is common to the compilers I used.

When I loaded the input file into memory and worked from that, it was much faster (similar timings to yours [BB's], even though your machine is likely faster than mine).

But the bulk of the runtime is still extracting the tokens, and only a small part of it is running TA's keyword-checking routine.

(About 0.5 seconds, with a dummy function, and 0.56 seconds with the actual function. But the input is now 10 x sqlite3.c, or 2.2M lines. Around 4Mlps.. Keyword-checking is then around 15M token candidates per second I think, of which only a small fraction will be keywords, and given an already tokenised name to test.)

I'm not sure there's any point in doing more, without knowing more about the actual program this will be used in.

I wrote:

> I will try to gnaw out some time his evening to post my
> solution.

OK, my solution is below. Thiago's in-place initialisation
did not work in my TCC, so I wrote an initialisation
function, kw_init().

#include <stdio.h>
#include <stdlib.h>

/* TODO: allocate tree[] elements from kw_reg() as needed */
int tree[4096][256]; /* tree of character-tables for NFA-like recognition */
int tree_h; /* index of highest used element in tree[] */

/* register a keyword: */
static void kw_reg( char* word, int id )
{ int* ctab;
int next;
char c;

ctab = tree[0];

while( 1 )
{ c = *word;
/* token scanned => store its id and exit: */
if( c == '\0' )
{ ctab[c] = id; break; }
next = ctab[c];
/* known prefix passed => store current character: */
if( next == 0 )
{ ctab[c] = ++tree_h;
ctab = tree[tree_h];
}
/* still within known prefix => continue to next level: */
else
{ ctab = tree[next]; }
word += 1;
}
}

static int kw_check( char* word )
{ char c;
int* ctab;
int next;
int res;

res = -1;
ctab = tree[0];
while( 1 )
{ c = *word;
next = ctab[c];
/* end of token => recognised a keyword: */
if( c == '\0' )
{ res = next; break; }
if( next == 0 ) break; /* unknown prefix => not a keyword */
ctab = tree[next]; /* continue to next level */
word += 1;
}
return res;
}

static void kw_init()
{ tree_h = 0;

kw_reg("NULL" , 0);
kw_reg("_Alignas" , 1);
kw_reg("_Atomic" , 2);
kw_reg("_BitInt" , 3);
kw_reg("_Bool" , 4);
kw_reg("_Complex" , 5);
kw_reg("_Decimal128" , 6);
kw_reg("_Decimal32" , 7);
kw_reg("_Decimal64" , 8);
kw_reg("_Generic" , 9);
kw_reg("_Hashof" , 10);
kw_reg("_Imaginary" , 11);
kw_reg("_Noreturn" , 12);
kw_reg("_Static_assert", 13);
kw_reg("_Thread_local" , 14);
kw_reg("__alignof" , 15);
kw_reg("__asm" , 16);
kw_reg("__forceinline" , 17);
kw_reg("__inline" , 18);
kw_reg("__int16" , 19);
kw_reg("__int32" , 20);
kw_reg("__int64" , 21);
kw_reg("__int8" , 22);
kw_reg("_asm" , 23);
kw_reg("alignas" , 24);
kw_reg("alignof" , 25);
kw_reg("auto" , 26);
kw_reg("bool" , 27);
kw_reg("break" , 28);
kw_reg("case" , 29);
kw_reg("catch" , 30);
kw_reg("char" , 31);
kw_reg("const" , 32);
kw_reg("constexpr" , 33);
kw_reg("continue" , 34);
kw_reg("default" , 35);
kw_reg("defer" , 36);
kw_reg("do" , 37);
kw_reg("double" , 38);
kw_reg("else" , 39);
kw_reg("enum" , 40);
kw_reg("extern" , 41);
kw_reg("false" , 42);
kw_reg("float" , 43);
kw_reg("for" , 44);
kw_reg("goto" , 45);
kw_reg("if" , 46);
kw_reg("inline" , 47);
kw_reg("int" , 48);
kw_reg("long" , 49);
kw_reg("nullptr" , 50);
kw_reg("register" , 51);
kw_reg("repeat" , 52);
kw_reg("restrict" , 53);
kw_reg("return" , 54);
kw_reg("short" , 55);
kw_reg("signed" , 56);
kw_reg("sizeof" , 57);
kw_reg("static" , 58);
kw_reg("static_assert" , 59);
kw_reg("struct" , 60);
kw_reg("switch" , 61);
kw_reg("thread_local" , 62);
kw_reg("throw" , 63);
kw_reg("true" , 64);
kw_reg("try" , 65);
kw_reg("typedef" , 66);
kw_reg("typeid" , 67);
kw_reg("typeof" , 68);
kw_reg("typeof_unqual" , 69);
kw_reg("union" , 70);
kw_reg("unsigned" , 71);
kw_reg("void" , 72);
kw_reg("volatile" , 73);
kw_reg("while" , 74);
}

static void test( char* token )
{ printf("%14s: %2i\n", token, kw_check( token ) ); }

int main( int argc, char** argv )
{ kw_init();
test( "NULL" );
test( "constexpr" );
test( "while" );
test( "_Thread_local" );
test( "random" );
}

Thiago Adams:

> Can you plug this one and compare? I am not using the length.
> https://godbolt.org/z/1Yh96ch5e

The site won't open on my PC with Windows XP. This is
Usenet. Let us post code in our articles so that it is
archived and accessible for everyone with access to Usenet.
With respect to Usenet, the Web is a transcendental entity
and web references are out-of-bandwith.

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

Ben Bacarisse:

> I wanted to time some algorithms so in case we want some repeatable and
> comparable timings, here is the driver program I am using:
>
> #include <stdio.h>
> #include <ctype.h>
>
> int lookup(const char *tok, unsigned len);
>
> int main(void)
> {
> int r = 0, c;
> while ((c = getchar()) != EOF)
> if (isalpha(c) || c == '_') {
> unsigned l = 0;
> char token[20];
> do token[l++] = c;
> while (l < sizeof token - 1 &&
> (isalnum(c = getchar()) || c == '_'));
> token[l] = 0;
> r += lookup(token, l);
> }
> printf("\n", r);
> }
>
> The lookup function returns the token value or -1 for non-keywords.
>
> To get an IO and driver overhead, I compiles and linked with
>
> int lookup(const char *tok, unsigned l)
> {
> return 1;
> }

My program for performance testing accepts a set of measured
implementations, including a reference null implementaion,
as function pointers, runs them repeatedly with each dataset
until total running time exceeds a certain limit (e.g. 10
seconds), divides the result by the number of iterations,
and subracts the null time. It has the advantage of testing
all the algorithms against all the datasets in a single
run and of outputting their performace in a table. It uses
standard C fucntions to measure time. How do you measure
time?

Anton Shepelev <anton.txt@gmail.com> writes:

> I wrote:
>
>> I will try to gnaw out some time his evening to post my
>> solution.
>
> OK, my solution is below. Thiago's in-place initialisation
> did not work in my TCC, so I wrote an initialisation
> function, kw_init().

This program reports zero (the token value assigned to NULL) for a
number of strings that are not keywords.

> int main( int argc, char** argv )
> { kw_init();
> test( "NULL" );
> test( "constexpr" );
> test( "while" );
> test( "_Thread_local" );
> test( "random" );
> }

Try, for example, test("rest").

--
Ben.

Anton Shepelev <anton.txt@gmail.com> writes:

> Ben Bacarisse:
>
>> I wanted to time some algorithms so in case we want some repeatable and
>> comparable timings, here is the driver program I am using:
>>
>> #include <stdio.h>
>> #include <ctype.h>
>>
>> int lookup(const char *tok, unsigned len);
>>
>> int main(void)
>> {
>> int r = 0, c;
>> while ((c = getchar()) != EOF)
>> if (isalpha(c) || c == '_') {
>> unsigned l = 0;
>> char token[20];
>> do token[l++] = c;
>> while (l < sizeof token - 1 &&
>> (isalnum(c = getchar()) || c == '_'));
>> token[l] = 0;
>> r += lookup(token, l);
>> }
>> printf("\n", r);
>> }
>>
>> The lookup function returns the token value or -1 for non-keywords.
>>
>> To get an IO and driver overhead, I compiles and linked with
>>
>> int lookup(const char *tok, unsigned l)
>> {
>> return 1;
>> }
>
> My program for performance testing accepts a set of measured
> implementations, including a reference null implementaion,
> as function pointers, runs them repeatedly with each dataset
> until total running time exceeds a certain limit (e.g. 10
> seconds), divides the result by the number of iterations,
> and subracts the null time. It has the advantage of testing
> all the algorithms against all the datasets in a single
> run and of outputting their performace in a table. It uses
> standard C fucntions to measure time. How do you measure
> time?

For the numbers I posted, the time built-in bash command. You code
looks fast, but it seems to have a bug.

What times do you get for the other search strategies?

--
Ben.

On Wednesday, August 24, 2022 at 7:04:36 PM UTC-3, Anton Shepelev wrote:
> Thiago Adams:
> > Can you plug this one and compare? I am not using the length.
> > https://godbolt.org/z/1Yh96ch5e
> The site won't open on my PC with Windows XP. This is
> Usenet. Let us post code in our articles so that it is
> archived and accessible for everyone with access to Usenet.
> With respect to Usenet, the Web is a transcendental entity
> and web references are out-of-bandwith.

This generator outputs the code

#include <stdio.h>
#include <string.h>

void GenerateCore(const char* keywords[], int first, int last, int level, int* count)
{ int ident = (level + 1) * 2;
for (int i = first; i <= last; i++)
{
int begin = i;
int end = begin;
for (int k = i + 1; k <= last; k++)
{
if (keywords[k][level] == keywords[begin][level])
{
i++;
end++;
}
else
break;
}

//we have the range
if (begin == end)
{
//just one
if (keywords[i][level] != '\0')
{
printf("%*cif(*s == '%c') {\n", ident * 2, ' ', keywords[i][level]);
printf("%*cs++;\n", ident * 3, ' ');
printf("%*c/*%s*/\n", ident * 3, ' ', keywords[i]);
printf("%*cif(", ident * 3, ' ');
int len = (int)strlen(keywords[i]);

int j = level + 1;
for (; j < len; j++)
{
if (j != level + 1)
printf(" &&");

printf(" *s++ == '%c'", keywords[i][j]);
}
if (j != level + 1)
{
printf(" &&");

}
printf(" *s =='\\0') return %i; else return -1;", i);

printf(";\n");
printf("%*c}\n", ident * 2, ' ');
}
else
{
printf("%*cif(*s == '\\0') {\n", ident * 2, ' ');
printf("%*cs++;\n", ident * 3, ' ');
printf("%*c/*%s*/\n", ident * 3, ' ', keywords[i]);
printf("%*creturn % i;\n", ident * 3, ' ', i);
printf("%*c};\n", ident * 2, ' ');
}

(*count)++;
}
else
{
printf("%*cif(*s == '%c') {\n", ident * 2, ' ', keywords[i][level]);
printf("%*cs++; \n", ident * 3, ' ');
GenerateCore(keywords, begin, end, level + 1, count);
printf("%*creturn -1;\n", ident * 3, ' ');
printf("%*c};\n", ident * 2, ' ');
}
}
}

void Generate(const char* keywords[], int size)
{ /*sort keywords*/
int i, j;
for (i = 0; i < size - 1; i++) {
for (j = 0; j < size - i - 1; j++) {
if (strcmp(keywords[j], keywords[j + 1]) > 0) {
char* temp = keywords[j + 1];
keywords[j + 1] = keywords[j];
keywords[j] = temp;
}
}
}

printf("int is_keyword(const char* s)\n");
printf("{\n");
int count = 0;
GenerateCore(keywords, 0, size - 1, 0, &count);
printf("%*creturn -1;\n", 2, ' ');
printf("}\n");
}

int main()
{ const char* keywords[] = {
"NULL", "_Alignas", "_Alignof",
"__alignof", "_Atomic", "_BitInt",
"_Bool", "_Complex", "_Decimal128",
"_Decimal32", "_Decimal64", "_Generic",
"_Hashof", "_Imaginary", "_Noreturn",
"_Static_assert", "_Thread_local", "__asm",
"__forceinline", "__inline", "__int16",
"__int32", "__int64", "__int8",
"_asm", "alignas", "alignof", "auto",
"bool", "break", "case", "catch",
"char", "const", "constexpr", "continue",
"default", "defer", "do", "double",
"else", "enum", "extern", "false",
"float", "for", "goto", "if",
"inline", "int", "long", "nullptr",
"register", "repeat", "restrict", "return",
"short", "signed", "sizeof", "static",
"static_assert", "struct", "switch", "thread_local",
"throw", "true", "try", "typedef", "typeid",
"typeof", "typeof_unqual", "union", "unsigned", "void",
"volatile", "while"
};

Generate(keywords, sizeof(keywords) / sizeof(keywords[0]));
}

Staff meeting in the conference room in %d minutes.

devel / comp.lang.c / Re: How to optimize this search?

devel / comp.lang.c / Re: How to optimize this search?

Subject	Author
How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Scott Lurndal
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Scott Lurndal
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Scott Lurndal
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	bart c
Re: How to optimize this search?	antispam
Re: How to optimize this search?	bart c
Re: How to optimize this search?	antispam
Re: How to optimize this search?	David Brown
Re: How to optimize this search?	bart c
Re: How to optimize this search?	David Brown
Re: How to optimize this search?	antispam
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Siri Cruise
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Siri Cruise
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	antispam
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Öö Tiib
Re: How to optimize this search?	antispam
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Stefan Ram
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	David Brown
Re: How to optimize this search?	William Ahern
Re: How to optimize this search?	Bonita Montero
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Bonita Montero
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Bonita Montero
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Bonita Montero
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Bonita Montero
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	antispam
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Tim Rentsch
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Stefan Ram
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	bart c
Re: How to optimize this search?	antispam
Re: How to optimize this search?	Keith Thompson
Re: How to optimize this search?	David Brown
Re: How to optimize this search?	bart c
Re: How to optimize this search?	antispam
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Ben Bacarisse
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Anton Shepelev
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	Thiago Adams
Re: How to optimize this search?	bart c
Re: How to optimize this search?	Anton Shepelev