Factor/To do/Compiler

high level optimizer

string-nth 123 eq? -- use string-nth-fast here
speed up binary-reduce
compile range iteration better
fix predicate propagation
[ 1 2 <cursor> ] [ f ] if dup [ A ] [ B ] if -- we need better constraints so that A has slots
loop inversion
perhaps interval inference is unsound for fixnum+fast etc?
branch fusion
growable resize computation may overflow with 3 * but not 2 * 1+
>fixnum dup { [ 0 >= ] [ 0 <= ] } && [ 1 + ] when -- we don't optimize this

if we change =, we can do this:

[ dup 0.0 eq? [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
[ dup 0.0 eq? [ dup 0 = [ ] [ ] if ] [ ] if ] should
[ dup 0.0 = [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
[ dup 0.0 = [ dup 0 = [ ] [ ] if ] [ ] if ] should
[ dup 0.0 = [ 3 * ] when ] should
[ dup 0.0 number= [ 3 * ] when ] should

peg.javascript.parser takes a long time to load
remove-mixin-instance takes a long time
SPECIAL: foo -- compile versions of foo for different input classes
[ { fixnum } declare 3 shift >fixnum ] optimized. -- should optimize down to fixnum-shift-fast
2 <sliced-groups> [ ... ] assoc-each -- first2 has a bounds check, kill it
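
Several entries above describe what a quotation should or should not optimize to. A minimal sketch of how such a claim can be checked in the listener, assuming the optimized. word used in the list comes from the compiler.tree.debugger vocabulary (the vocabulary name is an assumption, it is not stated in this list):

```factor
USING: compiler.tree.debugger kernel.private math ;

! Run the high-level optimizer on the quotation and print the result.
! The declaration tells the compiler that the input is a fixnum.
! Per the entry above, the goal is for fixnum-shift-fast to appear
! in the printed quotation in place of the generic shift.
[ { fixnum } declare 3 shift >fixnum ] optimized.
```

The printed form depends on the compiler version, so no expected output is shown here.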

done:

integer-integer partial ops should use both-fixnums?
partial dispatch ops should be flushable, and should have input-classes decls
optimize out both-fixnums? where possible
inlining heuristic: number of usages within this word
we want some way to define HINTS: for literal values
use the specialized-def when inlining?
above two items are motivated by a desire to specialize <float-array> for small arrays, to speed up raytracer and nbody
fixnum/i --> fixnum/i-fast if denominator is positive!
[ <=> ] sort [ <=> ] sort [ <=> ] sort -- still doesn't scale linearly
don't inline push unless we have a non-vector; otherwise it's pointless
remove support for 'f' from intervals
add identities for integer math

low level optimizer

VN should use identity comparison for ##load-literal instead of value equality
combine VN with constant branch folding as a first step towards global optimization?
once low level IR is SSA, we don't have to spill a live interval which has already been spilled once
tag { ... } dispatch: we can eliminate a shift instruction there
the tagged comparison optimization is not actually sound
2 swap fixnum+fast
2 fixnum+fast 1 fixnum+fast
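
The last two entries are bare expressions with no note attached. One possible reading (an assumption, not something this list states) is that the low level optimizer should treat a constant first operand like a constant second operand, and should fold two consecutive constant additions into one. The sketch below only shows that the forms compute the same value for a sample input; it does not show what code the compiler generates for them:

```factor
USING: kernel math.private prettyprint ;

! Constant as the first operand versus the second operand.
10 2 swap fixnum+fast .           ! 12
10 2 fixnum+fast .                ! 12

! Two constant additions versus a single folded addition.
10 2 fixnum+fast 1 fixnum+fast .  ! 13
10 3 fixnum+fast .                ! 13
```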

done:

powerpc: intrinsics for fixnum+, fixnum-, fixnum
fixnum+ doesn't have an intrinsic anymore
powerpc: clean up stack-frame code and test spilling
powerpc: enable float intrinsics
dispatch: there's a potential jump to a jump here
linux-ppc: save/restore nonvolatile fp registers
dispatch branch alignment off
spilling: align stack on windows (or use unaligned load/store insns?), leave room in stack frame for integer and float spills, resolve pass
need an optimizer pass to get rid of jumps to returns and jumps to jumps

This revision created on Mon, 16 Mar 2009 05:53:13 by slava
