Factor/To do/Compiler
high level optimizer
- string-nth 123 eq? -- use string-nth-fast here
- speed up binary-reduce
- compile range iteration better
- fix predicate propagation
- [ 1 2 <cursor> ] [ f ] if dup [ A ] [ B ] if -- we need better constraints so that A has slots
- loop inversion
- perhaps interval inference is unsound for fixnum+fast etc?
- branch fusion
- growable resize computation may overflow with 3 but not 2 1+
- >fixnum dup { [ 0 >= ] [ 0 <= ] } && [ 1 + ] when -- we don't optimize this
  if we change =, we can do this:
  [ dup 0.0 eq? [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
  [ dup 0.0 eq? [ dup 0 = [ ] [ ] if ] [ ] if ] should
  [ dup 0.0 = [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
  [ dup 0.0 = [ dup 0 = [ ] [ ] if ] [ ] if ] should
  [ dup 0.0 = [ 3 * ] when ] should
  [ dup 0.0 number= [ 3 * ] when ] should
- peg.javascript.parser takes a long time to load
- remove-mixin-instance takes a long time
- SPECIAL: foo -- compile versions of foo for different input classes
- [ { fixnum } declare 3 shift >fixnum ] optimized. -- should optimize down to fixnum-shift-fast
- 2 <sliced-groups> [ ... ] assoc-each -- first2 has a bounds check, kill it
done:
- integer-integer partial ops should use both-fixnums?
- partial dispatch ops should be flushable, and should have input-classes decls
- optimize out both-fixnums? where possible
- inlining heuristic: number of usages within this word
- we want some way to define HINTS: for literal values
- use the specialized-def when inlining?
- above two items are motivated by a desire to specialize <float-array> for small arrays, to speed up raytracer and nbody
- fixnum/i --> fixnum/i-fast if denominator is positive!
- [ <=> ] sort [ <=> ] sort [ <=> ] sort -- still doesn't scale linearly
- don't inline push, unless we have a non-vector, pointless
- remove support for 'f' from intervals
- add identities for integer math
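
The fixnum/i item above rests on one fact: truncating division of two fixnums can only leave the fixnum range when the most negative fixnum is divided by -1, so a denominator known to be positive rules the overflow out. A minimal sketch of that reasoning, written in Python rather than Factor; FIXNUM_BITS, the 61-bit width and the helper names are assumptions made up for illustration, not taken from the compiler:

```python
FIXNUM_BITS = 61  # assumed width, for illustration only; the real width is platform-dependent
MOST_POSITIVE = (1 << (FIXNUM_BITS - 1)) - 1
MOST_NEGATIVE = -(1 << (FIXNUM_BITS - 1))


def is_fixnum(n):
    """True if n fits in the assumed fixnum range."""
    return MOST_NEGATIVE <= n <= MOST_POSITIVE


def truncating_div(x, y):
    """Integer division that rounds toward zero."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def can_overflow(x, y):
    """True if the quotient of two fixnums is no longer a fixnum."""
    return not is_fixnum(truncating_div(x, y))
```

With these definitions, can_overflow(MOST_NEGATIVE, -1) is true, while no pair with a positive denominator overflows, because dividing by a value of at least 1 never increases the magnitude of the numerator.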
low level optimizer
- VN should use identity comparison for ##load-literal instead of value equality
- combine VN with constant branch folding as a first step towards global optimization?
- once low level IR is SSA, we don't have to spill a live interval which has already been spilled once
- tag { ... } dispatch: we can eliminate a shift instruction there
- the tagged comparison optimization is not actually sound
- 2 swap fixnum+fast
- 2 fixnum+fast 1 fixnum+fast
done:
- powerpc: intrinsics for fixnum+, fixnum-, fixnum
- fixnum+ doesn't have an intrinsic anymore
- powerpc: clean up stack-frame code and test spilling
- powerpc: enable float intrinsics
- dispatch: there's a potential jump to a jump here
- linux-ppc: save/restore nonvolatile fp registers
- dispatch branch alignment off
- spilling: align stack on windows (or use unaligned load/store insns?), leave room in stack frame for integer and float spills, resolve pass
- need an optimizer pass to get rid of jumps to returns and jumps to jumps
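
The last item (a pass that removes jumps to jumps and jumps to returns) could look roughly like the sketch below. It is written in Python rather than Factor and is not the compiler's actual pass: the block representation, the instruction tuples and the name thread_jumps are all hypothetical, chosen only to show the idea.

```python
def thread_jumps(blocks):
    """Retarget jumps that land on trivial blocks.

    blocks maps a label to a list of instruction tuples. The last tuple is a
    terminator: ("jump", label), ("branch", then_label, else_label) or
    ("return",). This layout is an assumption made for the sketch.
    """
    def resolve(label):
        # Follow chains of blocks that contain nothing but a jump.
        # The seen set stops the walk on a cycle of empty jump blocks.
        seen = set()
        while label not in seen:
            seen.add(label)
            body = blocks[label]
            if len(body) == 1 and body[0][0] == "jump":
                label = body[0][1]
            else:
                break
        return label

    result = {}
    for label, body in blocks.items():
        *rest, term = body
        if term[0] == "jump":
            target = resolve(term[1])
            if blocks[target] == [("return",)]:
                term = ("return",)  # a jump to a bare return becomes a return
            else:
                term = ("jump", target)
        elif term[0] == "branch":
            term = ("branch", resolve(term[1]), resolve(term[2]))
        result[label] = rest + [term]
    return result
```

Blocks that become unreachable after retargeting are left in place here; a separate dead-block sweep would remove them.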
This revision created on Mon, 16 Mar 2009 05:53:13 by slava