Whitespaces removed after text change delay for syntax highlighting

The title says it all: whenever I type text into my TextField and wait 1 second for the parser (thank you, @player_03, by the way) to apply the highlighting, it removes the whitespace. I am unsure how to stop either the parser or my code from doing that.

Here is my code:

import openfl.text.Font;
import openfl.display.Sprite;
import openfl.text.TextField;
import openfl.text.TextFormat;
import openfl.Assets;
import openfl.events.Event;
import openfl.events.TextEvent;
import player03.markdownparser.MarkdownParser;
import player03.markdownparser.MarkdownTag;
import openfl.text.TextFieldType;
import openfl.utils.Timer;
import openfl.events.TimerEvent;

class CodeEditor extends Sprite
{

    private var _mainText:TextField;
    private var _lineNumbersText:TextField;
    private var _sourceCode:Font;
    private var _time:Timer;
    
    public function new() 
    {
        super();
        
        _sourceCode = Assets.getFont("font/sourceCode.ttf");
        
        _mainText = new TextField();
        _mainText.defaultTextFormat = new TextFormat(_sourceCode.fontName, 11, 0x000000);
        _mainText.x = x;
        _mainText.y = y + 20;
        _mainText.selectable = true;
        _mainText.embedFonts = true;
        _mainText.wordWrap = true;
        _mainText.multiline = true;
        _mainText.type = TextFieldType.INPUT;
        _mainText.height = 350;
        _mainText.width = 500;
        _mainText.text = "Some text.";
        
        _time = new Timer(1000, 1);
        _time.addEventListener(TimerEvent.TIMER_COMPLETE, textChangeDelayed);
        
        graphics.beginFill(0xDDDDDD, 1);
        graphics.drawRect(x, y, 500, 300);
        
        addEventListener(Event.ADDED_TO_STAGE, init);
        _mainText.addEventListener(TextEvent.TEXT_INPUT, input);
    }
    
    private function init(e:Event):Void
    {
        removeEventListener(Event.ADDED_TO_STAGE, init);
        
        addChild(_mainText);
        
        
    }
    
    private function input(e:TextEvent):Void
    {
        if (_time.running)
            _time.stop();
        _time.start();
        
    }
    
    private function parseAll(parser:MarkdownParser):Void
    {
        parser.parse(_mainText.text).apply(_mainText);
    }
    
    private function textChangeDelayed(e:TimerEvent):Void
    {
        var parser:MarkdownParser = new MarkdownParser(
            [
                new MarkdownTag("Common Keywords", 
                    "(super|private|public|inline|var|class|new|function|extends|implements|static|if|else|try|catch|switch|case|break|continue|for|while) ",
                    new TextFormat(_sourceCode.fontName, 11, 0x0000FF))
            ]
        );
        
        parseAll(parser);
        _time.reset();
    }
    
}

Here is the result before the parse:

and after:

I can’t think of why it would do that in the first place. Maybe it has something to do with how the parser matches patterns, so it may need tweaking? The parser, by the way, can be found here.

Thank you for any help in solving this matter.

It’s because the space isn’t inside the parentheses. Everything outside the parentheses is normally stripped.

For instance, one of the default expressions is __(.+?)__, which will match things such as “__sample text__” and turn it into “sample text”.
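As a rough illustration of that stripping, here is a sketch using plain Haxe EReg rather than the parser's actual code (the class and function names are made up for the example). Replacing each match with only its first group has the same visible effect: anything the pattern matched outside the parentheses disappears, including the trailing space in the keyword pattern.

```haxe
class StripDemo {
    // Replaces every match of `pattern` with the contents of its first group,
    // so whatever was matched outside the parentheses is dropped.
    public static function keepGroupOnly(pattern:EReg, text:String):String {
        return pattern.replace(text, "$1");
    }

    static function main() {
        // The underscores are outside the group, so they are removed.
        trace(keepGroupOnly(~/__(.+?)__/g, "__sample text__")); // sample text

        // The space after the keyword is outside the group too, so it is removed.
        trace(keepGroupOnly(~/(var|class) /g, "var x = 1;")); // varx = 1;
    }
}
```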

Unfortunately, putting the space inside the parentheses will cause a stack overflow, because the parser checks tags recursively. If no text gets removed, it’ll just match the same thing again.

I don’t think this is something to be fixed; I think it’s a misuse of the library. The library is meant to look for excess characters, remove them, and apply appropriate formatting.

In fact, a class that didn’t need to remove anything would be a lot simpler:

class KeywordHighlighter {
    private var matcher:EReg;
    private var format:TextFormat;
    
    public function new(keywords:Array<String>, color:Int) {
        matcher = new EReg("\\b(?:" + keywords.join("|") + ")\\b", "");
        format = new TextFormat(null, null, color);
    }
    
    public function highlightKeywords(textField:TextField):Void {
        var text:String = textField.text;
        
        while(matcher.match(text)) {
            var start:Int = matcher.matchedLeft().length + matcher.matchedPos().pos;
            var end:Int = start + matcher.matchedPos().len;
            textField.setTextFormat(format, start, end);
            
            text = matcher.matchedRight();
        }
    }
}

Usage:

private function textChangeDelayed(e:TimerEvent):Void
{
    var highlighter:KeywordHighlighter = new KeywordHighlighter(
        ["super", "private", "public", "inline", "var", "class",
        "new", "function", "extends", "implements", "static", "if",
        "else", "try", "catch", "switch", "case", "break", "continue",
        "for", "while"], 0x0000FF);
    
    highlighter.highlightKeywords(_mainText);
    
    _time.reset();
}

Thank you for your help once again! Of course, with syntax highlighting it’s going to be a long trek, so this is going to take a while!

Oops, messed up the algorithm. Use this instead:

class KeywordHighlighter {
    private var matcher:EReg;
    private var format:TextFormat;
    
    public function new(keywords:Array<String>, color:Int) {
        matcher = new EReg("\\b(" + keywords.join("|") + ")\\b", "");
        format = new TextFormat(null, null, color);
    }
    
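    public function highlightKeywords(textField:TextField):Void {
        // Remaining, not yet searched part of the text.
        var text:String = textField.text;
        // Offsets into the full text; end also marks where the remainder begins.
        var start:Int = 0;
        var end:Int = 0;
        
        while(matcher.match(text)) {
            // matchedLeft() is relative to the remainder, so add the previous end.
            start = end + matcher.matchedLeft().length;
            end = start + matcher.matchedPos().len;
            textField.setTextFormat(format, start, end);
            
            // Continue searching after this match.
            text = matcher.matchedRight();
        }
    }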
}
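To check the offset bookkeeping without a TextField, the same loop can be run on a plain string. This is only a sketch: OffsetDemo and keywordRanges are hypothetical names, and the function collects the start/end pairs instead of applying a format.

```haxe
class OffsetDemo {
    // Returns a [start, end] pair (end exclusive) for every keyword in `text`,
    // using the same running-offset loop as highlightKeywords().
    public static function keywordRanges(text:String, keywords:Array<String>):Array<Array<Int>> {
        var matcher = new EReg("\\b(" + keywords.join("|") + ")\\b", "");
        var ranges = [];
        var end = 0;

        while (matcher.match(text)) {
            // matchedLeft() is measured from the start of the remainder,
            // so the previous end has to be added to get an absolute index.
            var start = end + matcher.matchedLeft().length;
            end = start + matcher.matchedPos().len;
            ranges.push([start, end]);

            text = matcher.matchedRight();
        }
        return ranges;
    }

    static function main() {
        // "var" covers 0..3 and "new" covers 8..11 in this string.
        trace(keywordRanges("var a = new B();", ["var", "new"]));
    }
}
```

Without adding the previous end, the second range would be computed relative to the chopped-off remainder and would land on the wrong characters.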

Also be sure you clear all formatting before calling highlightKeywords(). Otherwise, words will stay blue even if they’re no longer keywords.
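One possible way to do that, as a minimal sketch: reapply the field's defaultTextFormat to the whole text before highlighting. HighlightHelper and rehighlight are hypothetical names, and this assumes the default format carries the plain text color (as it does in the code from the question).

```haxe
import openfl.text.TextField;

class HighlightHelper {
    // Resets every character to the field's default format, then
    // reapplies keyword highlighting on top of the clean text.
    public static function rehighlight(textField:TextField, highlighter:KeywordHighlighter):Void {
        // With no begin/end index, setTextFormat applies to the entire text.
        textField.setTextFormat(textField.defaultTextFormat);
        highlighter.highlightKeywords(textField);
    }
}
```

In textChangeDelayed() this would replace the direct call to highlightKeywords(_mainText).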


I think I’m getting the hang of regular expressions. Using this to match a string:

var stringPattern:EReg = ~/".+?([^\\"]|\\".+?)*"|'.+?([^\\']|\\'.+?)*'|""|''/;
var stringColor:Int = 0xA06909;

This is going to be for an open source video game editor (OpenFL and the like), so it will be interesting to see where this goes.

I forget if I referred you to this site before, but http://www.regextester.com/ is an excellent way to create and debug regular expressions.

Using this, I was able to simplify your regex down to ("|')((?:(?!\\\1).|\\\1)*?)\1. As a bonus, if it matches a string, the contents of the string will always be placed in group 2. This includes empty strings, in which case group 2 will itself be an empty string.

To see how it works, paste it into the tester and mouse over each part of the expression. (Their explanation of negative lookaheads isn’t very good, so look here instead.)
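For reference, here is a small sketch of that regex used from Haxe. StringDemo and firstStringContents are made-up names, and lookahead support depends on the regex engine of the target you compile to, so treat this as something to try rather than a guarantee.

```haxe
class StringDemo {
    // The simplified regex from above as a Haxe literal.
    // Group 1 is the opening quote, group 2 is the string contents.
    static var stringPattern:EReg = ~/("|')((?:(?!\\\1).|\\\1)*?)\1/;

    // Returns the contents of the first string literal in `source`,
    // or null if there is none.
    public static function firstStringContents(source:String):Null<String> {
        return stringPattern.match(source) ? stringPattern.matched(2) : null;
    }

    static function main() {
        // Escaped quotes stay inside the match; prints: hi \"there\"
        trace(firstStringContents('say "hi \\"there\\"" now'));

        // No string literal in the input, so the result is null.
        trace(firstStringContents("no quotes here"));
    }
}
```

One limitation to keep in mind: an escaped backslash right before the closing quote (a string ending in `\\`) is read as an escaped quote by this pattern, so it would need extra handling.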