Improving fuzzy match

The current fuzzy matching algorithm is not good for efficiency compared to those in helm. Copied from my IRC messages:

  1. substring-norm is kinda adhoc, and need to type spaces to get
    reasonable result
  2. DL-distance is in principle very good, however there’s a fatal
    issue: it is a symmetric metric that also allows deletion operation
  3. Seems that command prompt give similar weight to both command name
    and docstring, this basically makes any command with (a few)
    occurrence of “buffer” in docstring outscores switch-buffer
    and because DL distance allow deletion, not mentioning “switch” at all
    doesn’t really hurt their score that much.
    I think fuzzy search as typically used in search engines and used
    in Emacs is fundamentally different – the former is expected to
    correct error (like single character substitution), while the latter
    is expected to improve efficiency.

I posted a simple fuzzy match algorithm at simple flex score function for nyxt · GitHub
According to feedback I received currently, it’s better than the current default.

Should we provide it as the new default for Nyxt? We can also purge the mk-string-metrics dependency if we do so.

1 Like

I haven’t yet tried using your function, but I want to — I’ve been noticing quirks of the current one for too long! Improving fuzzy matching sounds like a welcome change :slight_smile:

I suggest you open a pull request on Github. Given the benefits of your function, I’m pretty sure we’ll merge it! Even if we don’t, it’s still worth discussing the fuzzy matching there :upside_down_face:

1 Like

Sounds good! I ask here before making any PR because at least in some projects people are reluctant to replace any previous things. In our case there could (hypothetically) be reasons that some users still want the old algorithm, although IMO my version is more handy in every case I tried. I’ll make a PR then!

Yes, it does make sense.

We are open to any suggestions and PRs, if those are sane and improve Nyxt :smiley:

Regarding old algorithms: better defaults and full configurability go hand in hand in our case. If anyone doesn’t like the default behavior, they can extend Nyxt to behave the way they want. That’s exactly what you did :slight_smile: