Reptorian

joined 1 year ago
[–] [email protected] 2 points 5 months ago

For small projects, rewriting is often superb. It allows us to reorganize a mess, apply new knowledge, add neat features and doodads, etc.

This. I'm coding to contribute to open-source software with a very small number of coders and a non-mainstream Domain-Specific Language. A lot of the code I wrote before has been proven to work from time to time, but it all could benefit from better outputs and a better GUI. So I end up reengineering the entire thing, and that'll take a really long time; however, I do a lot of tests to ensure it works.

[–] [email protected] 2 points 5 months ago

I have to say, I really like the concept behind this. It may be another tool for parsing strings that I have besides Python.

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago)

I don’t understand your problem well enough to know, if you can (or want to) use this here, but you might be able to tap into that C performance with the radix conversion formatting of printf.

The problem is printing a big binary number as decimal. That's not an easy problem because 10 is not a power of 2. If we lived in a base-16 world, this would be very easy to solve in O(n).

Also, I can't access that, as G'MIC is a language that can't really communicate with other languages; it's not meant to share memory.
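To illustrate why a power-of-two target base is the easy case, here is a minimal Python sketch (not G'MIC; `bits_to_hex` is a hypothetical name used only for illustration). Each group of 4 bits maps to exactly one hex digit, so no carries ever cross group boundaries and a single linear pass is enough. Decimal has no such alignment, which is what makes it harder.

```python
def bits_to_hex(bits):
    """Convert a binary string to hexadecimal in one linear pass."""
    # Left-pad so the length is a multiple of 4 (one hex digit per 4 bits).
    pad = -len(bits) % 4
    bits = "0" * pad + bits
    digits = "0123456789abcdef"
    # Every 4-bit group is independent of the others: no carrying needed.
    return "".join(digits[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


print(bits_to_hex("11111111"))
```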

[–] [email protected] 2 points 5 months ago

This could be an XY problem, that is, you’re trying to solve problem X, rather than the underlying problem Y. Y here being: Why do you need things to be in decimal in the first place?

I wouldn't say it's needed; this is more of a fun thing for me. The only thing I'm using this for is Tupper's Self-Referential formula, and my current approach of converting base 1<<24 to base 1e7 works instantly for 106x17 binary digits. When I load an image larger than roughly 256x256 into that filter, delays are noticeable because the underlying algorithm isn't that great, but it could also have to do with the fact that G'MIC is interpreted, and despite its JIT support, this is not the kind of problem it's meant to solve (it's domain-specific). On the bright side, this algorithm will work with any data types as long as one is one level higher than the other. In this case, I'm using the lowest levels (single and double), and the bigger the data type, the faster it can be.
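For context on where the huge number comes from, here is a small Python sketch of the bitmap-to-k encoding behind Tupper's formula. In the formula, the point at column x and row r of the 106x17 strip is drawn exactly when bit 17*x + r of k/17 is set. The function names are hypothetical, and the sketch works in the formula's own coordinates, so any flipping needed for display is left out.

```python
WIDTH, HEIGHT = 106, 17


def encode_k(pixels):
    """pixels[x][r] is truthy when the point (x, k + r) should be drawn."""
    n = 0
    for x in range(WIDTH):
        for r in range(HEIGHT):
            if pixels[x][r]:
                n |= 1 << (HEIGHT * x + r)
    # k is always a multiple of 17.
    return HEIGHT * n


def decode_k(k):
    """Inverse of encode_k: recover the 106x17 grid of 0/1 values."""
    n = k // HEIGHT
    return [[(n >> (HEIGHT * x + r)) & 1 for r in range(HEIGHT)]
            for x in range(WIDTH)]
```

Printing k for a user-supplied bitmap is where the binary-to-decimal conversion comes in, since k can have up to about 106*17 binary digits.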

 

At the moment, I am stuck with using single-precision and double-precision floats. So, integers are only represented exactly up to 1<<24 for single, and up to 1<<53 for double.
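A quick way to see those two limits, sketched in Python rather than G'MIC (`as_single` is a hypothetical helper that round-trips a value through IEEE-754 single precision using the standard `struct` module):

```python
import struct


def as_single(value):
    """Round-trip a number through IEEE-754 single precision."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


# 1<<24 survives the trip, but the next integer does not.
print(as_single(2**24) == 2**24)
print(as_single(2**24 + 1) == 2**24 + 1)

# The same thing happens to double precision one step past 1<<53.
print(float(2**53) == 2**53)
print(float(2**53 + 1) == 2**53 + 1)
```

Past those points, consecutive integers start collapsing onto the same float, which is why each digit of the big number has to stay below them.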

Because of this, I made the following script here - https://gist.github.com/Reptorian1125/71e3eec41e44e2e3d896a10f2a51448e .

Allow me to clarify the script above. In the first part, what rep_bin2dec does is return the converted value into the status. So, when I do ${} or variable=${rep_bin2dec\ ???}, I get the status string.

In the second part, rep_bin2dec_base is the basis for getting rep_bin2dec to work. _rep_bin2dec_base prints the base_10M array into a string.

So, how does rep_bin2dec_base convert a big binary into a big decimal?

  1. If the binary image has a dimension of less than 54, the script uses 0b{}, which lets me convert binary to decimal directly, since up to 53 binary digits fit exactly in a double. 0b is a binary literal prefix, much like in Python and C++. From this point, it's pretty obvious what to do. So, if the dimension is less than 54, step 1 is all there is to it. If not, move on to step 2.

  2. Convert the binary image into an image of base (1<<24) representing the value of that image. Note that there are two channels "[ output_value , y ]". y in this case represents the digit position in base (1<<24).

  3. Turn the converted image into a dynamic array image. This allows us to remove unused digits. You can look at steps 2 and 3 as converting a binary string into an array of base (1<<24) and then into a dynamic array. Also, note that start_value is stored. That's the very first digit.

  4. Note that number_of_decimals is the predicted number of characters after converting the binary to decimal. Then, multi-threading gets activated depending on the size of the dynamic array image. decimal_conversion_array_size and result_conversion_array_size define the sizes of the temporary arrays used to convert from base (1<<24) into base 10M. Finally, there's a new image that uses base 10 million for easy printing, and set is used to add the first digit of base (1<<24), which will then be converted to base 10M.

  5. On eval[-2], we are now processing the base (1<<24) image and converting it into base 10M. There's an implicit loop, so you can mentally add a "for y" after begin(), and begin() can be seen as the setup code.

Some notes: copy() basically allows me to alter an array. In this case, opacity is negative, so it will add the multiplication of the positive opacity. If opacity were between 0 and 1, it would be treated similarly to how the opacity of one layer alters an image. The multiplication algorithm used to convert between bases is Schönhage-Strassen multiplication, but without the FFT part.

So, here's how that works.

   9   9
x  1   9
_________
  81  81
9  9
_________
1  8  8 1

Basically, it's long multiplication, and you can see the carrying. The ones column is 81, so write 1 and carry 8. The tens column is 81 + 9 + 8 (carry) = 98, so write 8 and carry 9. The hundreds column is 9 + 9 (carry) = 18, so write 8 and carry 1. That's how this results in 1881.
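The 99 x 19 example can be sketched in Python as two separate passes: first accumulate the raw column sums with no carrying, then do one carry pass at the end. This is an illustration of the idea only, not the G'MIC script; `long_multiply` is a hypothetical name, and digits are stored least significant first.

```python
def long_multiply(a, b, base=10):
    """Multiply two digit lists (least significant digit first)."""
    # Pass 1: raw column sums, no carrying yet (a plain convolution).
    columns = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            columns[i + j] += da * db
    # Pass 2: a single sweep turns the column sums into proper digits.
    carry = 0
    for k in range(len(columns)):
        total = columns[k] + carry
        columns[k] = total % base
        carry = total // base
    # Drop unused high digits.
    while len(columns) > 1 and columns[-1] == 0:
        columns.pop()
    return columns


# 99 x 19: column sums are 81, 90, 9 before carrying.
print(long_multiply([9, 9], [9, 1]))
```

Deferring the carries is what lets the column sums be computed independently of each other, as long as the data type is wide enough to hold them.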

  6. After the conversion to base 10M, depending on your inputs, it'll either set the status value to the decimal representation or preserve it as base 10M for easy printing with _rep_bin2dec_base after alteration.

There are some more details, but I find it really hard to explain this.
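The overall pipeline (binary string, then base 1<<24 digits, then base 10M digits, then text) can be sketched in Python as below. This is a simplified, single-threaded, quadratic-time illustration under my own assumptions, not a port of the gist. All function names are hypothetical and the input is assumed to be a non-empty string of 0s and 1s. Note that each intermediate value stays below 1e7 * (1<<24) plus a small carry, roughly 1.7e14, which is comfortably under 1<<53 and is why a double can hold it exactly.

```python
LIMB_BITS = 24
DEC_BASE = 10_000_000  # "base 10M": 7 decimal characters per digit


def bits_to_limbs(bits):
    """Split a binary string into base-(1<<24) digits, most significant first."""
    first = len(bits) % LIMB_BITS or LIMB_BITS
    limbs = [int(bits[:first], 2)]
    for pos in range(first, len(bits), LIMB_BITS):
        limbs.append(int(bits[pos:pos + LIMB_BITS], 2))
    return limbs


def limbs_to_base_10m(limbs):
    """Horner's scheme: out = out * (1<<24) + limb, kept in base 10M (LSD first)."""
    out = [0]
    for limb in limbs:
        carry = limb
        for i in range(len(out)):
            total = out[i] * (1 << LIMB_BITS) + carry
            out[i] = total % DEC_BASE
            carry = total // DEC_BASE
        while carry:
            out.append(carry % DEC_BASE)
            carry //= DEC_BASE
    return out


def base_10m_to_str(digits):
    """Print the top digit as-is and zero-pad every lower digit to 7 characters."""
    head = str(digits[-1])
    return head + "".join(f"{d:07d}" for d in reversed(digits[:-1]))


def bin2dec(bits):
    return base_10m_to_str(limbs_to_base_10m(bits_to_limbs(bits)))


print(bin2dec("1010"))
```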

So, my question is: what are some good algorithms to print out huge binaries as decimal? I know Python is insanely good at that, but I can't seem to understand how it does that so well. I know that it involves conversion to base 2^30, or 1<<30.

At the moment, I can convert a 90,000-digit binary in 0.35 s, and that's bad compared to what I've seen in Python. It's really bad with 1M binary digits.
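The base 2^30 detail can be checked from Python itself through `sys.int_info`, which reports how many bits CPython stores per internal digit. The small wrapper below is only a sketch, and `int_digit_base` is a hypothetical name. It shows the storage base only and says nothing about which decimal conversion algorithm is used.

```python
import sys


def int_digit_base():
    """Return the base CPython uses internally for its big-integer digits."""
    return 1 << sys.int_info.bits_per_digit


print(sys.int_info.bits_per_digit, int_digit_base())
```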

[–] [email protected] 2 points 5 months ago

Even simpler is repeat 10 { }

} just stands for done.

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago) (1 children)

I don't think we have a difference in opinion. What I'm saying is that some apps have many years of development behind them, and in those cases, C++ will likely be the only realistic option because it is way more time-consuming to switch. For example, Krita. I do agree that when there's a choice, C++ is less relevant these days.

[–] [email protected] 3 points 5 months ago (3 children)

C++ is still used for some popular applications, and it still is the only realistic option for these ones. I think there should be more Domain-Specific Languages. I want one for vector graphics like G'MIC is for raster graphics.

[–] [email protected] 2 points 6 months ago

I've been meaning to learn Ruby to get around using Python. I like Ruby's syntax better.

[–] [email protected] 3 points 7 months ago* (last edited 7 months ago) (1 children)

Coming from someone who has used 4 different languages (C#, C++, Python, and G'MIC), I just feel more comfortable when there are explicit end blocks, which is why I don't like Python. Of all of those languages, only Python lacks an explicit end block, which is off-putting in my opinion, and there aren't any other options with a similar role to Python.

[–] [email protected] 6 points 7 months ago (16 children)

You mean an interpreted language with a similar role to Python, but more like Rust/C++ in style? I actually want that so that I can ditch Python, even though I've learned it, and use this instead.

[–] [email protected] 2 points 7 months ago

This is great, even though when I code in Python, I'm not using it for performance reasons, but for convenience.

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago)

It's a bit of a pain to finish, but I'm basically working on creating an array of numbers to assist in sorting Unicode characters, and I'm making string processing commands for the G'MIC scripting language. That means I have to sort hundreds of thousands of characters by hand, and I've sorted tens of thousands of them already. I already did string_permutations, with which you can find the permutation at a given index or find the index at which a given permutation occurs. However, those commands need the array of numbers for an additional sorting option I'll add.
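As an illustration of the "permutation at index" and "index of permutation" idea, here is a Python sketch based on the factorial number system. It assumes distinct characters and lexicographic order with a 0-based index, which may differ from how string_permutations actually behaves. Both function names are hypothetical.

```python
from math import factorial


def permutation_at(chars, index):
    """Return the index-th permutation (0-based, lexicographic) of distinct chars."""
    pool = sorted(chars)
    out = []
    for remaining in range(len(pool), 0, -1):
        # Each choice of leading character covers (remaining - 1)! permutations.
        pick, index = divmod(index, factorial(remaining - 1))
        out.append(pool.pop(pick))
    return "".join(out)


def index_of(perm):
    """Inverse of permutation_at: the lexicographic rank of perm."""
    pool = sorted(perm)
    index = 0
    for ch in perm:
        pos = pool.index(ch)
        index += pos * factorial(len(pool) - 1)
        pool.pop(pos)
    return index


print(permutation_at("abc", 3), index_of("bca"))
```

The sorting order of the pool is exactly where a custom character ordering, such as the Unicode sorting array mentioned above, would plug in.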

 

Three things before I get to the relevant details.

  1. Brainfuck is an esoteric language which uses 8 characters. I'll leave details here - https://en.wikipedia.org/wiki/Brainfuck
  2. G'MIC is a language largely inspired by bash and one other shell scripting language, and partly inspired by C++ for JIT compilation. It's two languages in one: one outside of the JIT and one inside of it. Its main purpose is image processing, and it can do 3D things too, basically image-related things. It's Turing-complete, so making files has been done with it. Even making an executable compiled program is possible in practice (but I would point to doing that in C++ and compiling there instead).
  3. I am a G'MIC filters developer.

Anyways, I've taken some time to code up a Brainfuck interpreter within G'MIC. It wasn't that hard to do once I understood what Brainfuck is as a language. I did one earlier than this, but I had to have users define inputs beforehand. Recently, I created the rep_cin command to relieve users of doing that, and it is the closest thing to input() in Python or std::cin in C++.

Anyways, here's the code to my Brainfuck interpreter:

#@cli run_brainfuck_it: brainfuck_file,'_enforce_numbers_input={ 0=false | 1=true },_size_of_array>0
#@cli : Interprets Brainfuck code file within G'MIC brainfuck_interpreter.
#@cli : Default values: ,'_enforce_numbers_input=0','_size_of_array=512'
run_brainfuck_it:
    skip ${2=0},${3=512}
    it $1
    _brainfuck_interpreter $2,$3
    um run_brainfuck_it,run_brainfuck,_brainfuck_interpreter,_brainfuck_interpreter_byte_input
#@cli run_brainfuck: brainfuck_code,'_enforce_numbers_input={ 0=false | 1=true },_size_of_array>0
#@cli : Interprets Brainfuck code within G'MIC brainfuck_interpreter.
#@cli : Default values: ,'_enforce_numbers_input=0','_size_of_array=512'
run_brainfuck:
    skip ${2=0},${3=512}
    ('$1')
    _brainfuck_interpreter $2,$3
    um run_brainfuck_it,run_brainfuck,_brainfuck_interpreter,_brainfuck_interpreter_byte_input
_brainfuck_interpreter:
    # 1. Convert image into dynamic image
    resize 1,{whd#-1},1,1,-1 ({h}) append y # Convert string images into dynamic image
    name[-1] brainfuck_code                 # Name image into brainfuck_code

    # 2. Remove unused characters
    eval "
        const brainfuck_code=$brainfuck_code;
        for(p=h#brainfuck_code-2,p>-1,--p,
            char=i[#brainfuck_code,p];
            if(!(inrange(char,_'+',_'.',1,1)||(find('<>[]',char,0,1)!=-1)),
                da_remove(#brainfuck_code,p);
            );
        );
        if(!da_size(#brainfuck_code),
            run('error inval_code');
        );
        da_freeze(#brainfuck_code);
        "

    # 3. Evaluate brackets
    eval[brainfuck_code] >"
        begin(level=0;);
        i==_'['?++level:
        i==_']'?--level;
        if(level<0,run('error inv_bracks'););
        end(if(level,run('error inv_bracks');););"

    1x2  # Create 2 images of 1x1x1x1. One image is for storing print out characters, and the other is to allow inputs.
    _arg_level=1

    # 4. Create JIT code for executing brainfuck code.
    repeat h#$brainfuck_code {
        idx:=i[#0,$>]

        if $idx==_',' code_str.=run('$0_byte_input[-2]\ $1');ind_list[ind]=i#-2;                continue fi
        if $idx==_'.' code_str.=da_push(#-1,ind_list[ind]);                                     continue fi
        if $idx==_'+' code_str.=ind_list[ind]++;ind_list[ind]%=256;                             continue fi
        if $idx==_'-' code_str.=ind_list[ind]--;ind_list[ind]%=256;                             continue fi
        if $idx==_'<' code_str.=if(!inrange(--ind,0,$2,1,0),run("'error out_of_bound'"););      continue fi
        if $idx==_'>' code_str.=if(!inrange(++ind,0,$2,1,0),run("'error out_of_bound'"););      continue fi
        if $idx==_'[' code_str.=repeat(inf,if(!ind_list[ind],break(););                         continue fi
        if $idx==_']' code_str.=);                                                                       fi
    }

    # 5. Execute created JIT code. v + and v - is used to change verbosity level, not part of JIT execution. e[] is used to print into console.
    v +
    eval >begin(ind=0;ind_list=vector$2(););$code_str;end(da_freeze(#-1););
    v -

    # 6. Print out executed code result
    v + e[$^] "Brainfuck Output: "{t} v -
    remove
_brainfuck_interpreter_byte_input:
    repeat inf {
        wait         # For some reason, I had to add this to make this code work!

        if $> rep_cin "Brainfuck Interpreter - Wrong Input! Insert Integer for Argument#"$_arg_level": "
        else  rep_cin "Brainfuck Interpreter - Enter Argument#"$_arg_level" (Integers Only): "
        fi

        if $1 input:=(${}%208)+_'0'
        else  input=${}
        fi

        if isint($input) break fi
    }

    if $1 
        v + e[$^] "Brainfuck Interpreter Inserted Argument#"$_arg_level": "{$input-_'0'} v -
    else
        input%=256
        v + e[$^] "Brainfuck Interpreter Inserted Argument#"$_arg_level": "$input" :: "{`$input`} v -
    fi

    _arg_level+=1
    f[-1] $input

And the CLI test:

C:\Users\User\Documents\G'MIC\Brainfuck Interpreter>gmic "brainfuck_interpreter.gmic" run_brainfuck \">,>,<<++++++[>-------->--------<<-]>[>[>+>+<<-]>[<+>-]<<-]>[-]>+>>++++++++++<[->-[>>>]++++++++++<<+[<<<]>>>>]<-<++++++++++>>>[-<<<->>>]<<<<++++++[>++++++++>[++++++++>]<[<]>-]>>[.<<]<[<<]>>.\",1
[gmic]./ Start G'MIC interpreter (v.3.3.3).
[gmic]./ Input custom command file 'brainfuck_interpreter.gmic' (4 new, total: 4806).
[gmic]./ Brainfuck Interpreter Inserted Argument#1: 31
[gmic]./ Brainfuck Interpreter Inserted Argument#2: 3
[gmic]./ Brainfuck Output: 93
[gmic]./ End G'MIC interpreter.
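For readers who don't know G'MIC, the same stages (strip non-command characters, validate brackets, then execute with cells wrapping modulo 256 and a bounds-checked pointer) can be sketched in Python. This is a plain interpreter loop, not a translation of the JIT-string approach above. The function name, the bytes-based input, and treating end of input as 0 are my own assumptions.

```python
def run_brainfuck(code, input_bytes=b"", tape_size=512):
    # 1. Keep only the eight command characters.
    program = [c for c in code if c in "+-<>[].,"]
    if not program:
        raise ValueError("no Brainfuck commands found")

    # 2. Match brackets up front so that jumps are simple lookups.
    jump, stack = {}, []
    for pos, c in enumerate(program):
        if c == "[":
            stack.append(pos)
        elif c == "]":
            if not stack:
                raise ValueError("unbalanced brackets")
            start = stack.pop()
            jump[start], jump[pos] = pos, start
    if stack:
        raise ValueError("unbalanced brackets")

    # 3. Execute.
    tape = [0] * tape_size
    ptr = pc = 0
    inputs = iter(input_bytes)
    out = bytearray()
    while pc < len(program):
        c = program[pc]
        if c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c in "<>":
            ptr += 1 if c == ">" else -1
            if not 0 <= ptr < tape_size:
                raise IndexError("tape pointer out of bounds")
        elif c == ".":
            out.append(tape[ptr])
        elif c == ",":
            tape[ptr] = next(inputs, 0) % 256  # assumption: end of input reads as 0
        elif c == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = jump[pc]  # go back to the matching "["
        pc += 1
    return bytes(out)


print(run_brainfuck("++++++++[>++++++++<-]>+."))
```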
 

Basically just what the title said. The situation is that I use a Domain-Specific Language called G'MIC, and to this day, I haven't found a satisfactory answer to the lack of syntax highlighting. At the moment, I am using KDE Kate, as it's pretty good at structuring the code with its find/replace feature, tab indicators, and multi-window support.
