Arc Forumnew | comments | leaders | submitlogin
3 points by almkglor 6089 days ago | link | parent

>> we've gotta make ourselves one of those nifty profilers for Arc.

True, true. I was optimizing random functions in treeparse, but didn't get a boost in speed until you suggested optimizing 'many.



1 point by almkglor 6089 days ago | link

It seems that 'many is the low-hanging fruit of optimization. I've since gotten an 8-paragraph lorem ipsum piece, totalling about 5k, which renders in 3-4 seconds (about around 3800msec).

Hmm. Profiler.

I'm not 100% sure but maybe the fact that nearly all the composing parsers decompose the return value of sub-parsers, then recompose the return value, might be slowing it down? Maybe have parsers accept an optional return value argument, which 'return will fill in (instead of creating its own) might reduce significantly the memory consumption (assuming it's GC which is slowing it down)?

mockup:

  (def parser-function (remaining (o retval (list nil nil nil)))
    (....)
    (return parsed li actions retval))

  (def many-r (parser remaining acc act-acc (o retval (list nil nil nil)))
      (while (parse parser remaining retval)
        ; parsed
        (lconc acc (copy (car retval)))
        ; actions
        (lconc act-acc (copy (car:cdr:cdr retval)))
        (= remaining (car:cdr scratch)))
      (return (car acc) remaining (car act-acc) retval))
Removing 'actions might help too - we can now use just a plain 'cons cell, with car == parsed and cdr == remaining.

-----

1 point by raymyers 6089 days ago | link

I tried taking out actions for the heck of it. Removing them yields roughly a 30% speed increase on this benchmark:

  (time (do ((many anything) (range 1 5000)) nil))
Using the following method, we can keep actions as a feature but still get the 30% speedup when we don't use them.

  (def many (parser)
    "Parser is repeated zero or more times."
    (fn (remaining) (many-r parser remaining (tconc-new) nil)))

  (def many-r (parser li acc act-acc)
    (iflet (parsed remaining actions) (parse parser li)
           (many-r parser remaining
                   (lconc acc (copy parsed))
                   (if actions (join act-acc actions) act-acc))
           (return (car acc) li act-acc)))
Not bad, but still not as fast as we'd want for processing wiki formatting on the fly...

ed: Yes. act-acc, not (car act-acc).

-----

1 point by almkglor 6089 days ago | link

Hmm. If you remove 'actions, how about also trying to use just a single 'cons cell:

  (iflet (parsed . remaining) (parse parser remaining)
    ...)

  (def return (parsed remaining)
    (cons parsed remaining))
?

If the speed increase is that large on that testbench, it might very well be due to garbage collection.

This might be an interesting page for our problem here ^^

http://www.valuedlessons.com/2008/03/why-are-my-monads-so-sl...

-----

1 point by raymyers 6089 days ago | link

Tried changing the list to a single cons cell. I did not see any additional performance boost.

-----

1 point by almkglor 6089 days ago | link

  (def many-r (parser li acc act-acc)
    (iflet (parsed remaining actions) (parse parser li)
           (many-r parser remaining
                   (lconc acc (copy parsed))
                   (if actions (join act-acc actions) act-acc))
           (return (car acc) li (car act-acc))))
s/(car act-acc)/act-acc maybe?

Personally I don't mind losing 'actions, it does seem that 'filt would be better ^^.

-----

1 point by almkglor 6089 days ago | link

I tested this on my 8-paragraph 5000-char lorem ipsum page, and the run dropped down to about 3400msec (from 3800 msec).

Hmm. Not sure where the slow down is now ^^

I've tried my "retval" suggestion and it's actually slower, not faster. So much for not creating new objects T.T;

-----

1 point by almkglor 6089 days ago | link

Arrg, I've built a sort-of profiler for the wiki, it's on the git, to enable just look for the line:

  (= *wiki-profiling-on nil)
And change it to t, then reload Arki to turn it on. Then use (*wiki-profile-print) to print out the profile report.

Note that turning on profiling increases time by a factor of > 5. Don't use unless desperate.

Anyway a sample run - this page was rendered in about 800msec without profiling, with profiling it took about 5150msec:

  bold: 305
  bolded-text: 914
  nowiki-e: 31
  open-br: 305
  seq-r: 2681
  italicized-text: 793
  many-format: 4830
  plain-wiki-link: 841
  alt-r: 4555
  nowiki-text: 584
  nowiki: 184
  joined-wiki-link: 413
  ampersand-coded-text: 486
  ampersand-codes: 171
  italics: 128
  many-r: 4829
  formatting: 4616
  close-br: 39
Note that the timing will not be very accurate or particularly useful IMO, since it doesn't count recursion but does count calls to other functions. Sigh. We need a real profiler ^^

-----

2 points by raymyers 6088 days ago | link

>> Sigh. We need a real profiler ^^

Maybe this'll help. http://www.arclanguage.org/item?id=5318

-----

1 point by raymyers 6088 days ago | link

Knowing a bit about the call hierarchy, maybe we can squeeze a bit more knowledge out of that. Here's what seems to be going on:

    formatting               4616
    [-] alt-r                4555
     | bolded-text            914
     | plain-wiki-link        841
     | italicized-text        793
     | nowiki-text            584
     | ampersand-coded-text   486
     | joined-wiki-link       413

-----

1 point by almkglor 6088 days ago | link

Hmm, then the total time of alt-r's children is 4031, leaving 524 msec in alt-r itself.

My test page has quite a bit of bolded text (for testing), so I suppose it's the reason why bolded-text is the highest. Hmm.

Anyway I'm thinking of adding the following parser to the top of the big 'alt structure in formatting:

  (= plain-text
    (pred [or (alphadig _) (whitec _) (in _ #\. #\,)] anything))

  (= formatting
    (alt
      plain-text
      ...))
Hmm. It seems we can't squeeze much performance out of 'alt, I can't really see a way of optimizing 'alt itself, so possibly we should optimize the grammar that uses 'alt.

-----

1 point by almkglor 6088 days ago | link

Did something highly similar to this, it reduced my 8-paragraph lorem ipsum time from about 3200msec to 2100msec.

-----