Factor/To do/Compiler
high level optimizer
- 18:05 < OneEyed> : foo ( x -- y ) 1 + ;
  18:05 < OneEyed> HINTS: foo { fixnum } ;
  18:05 < OneEyed> : foo ( -- x ) 1 ;
- beust2: if we can infer that first is in [0,1], value is a fixnum, and remaining is a fixnum, then all generic arithmetic goes away
- constraints for tuple slots
- string-nth 123 eq? -- use string-nth-fast here
- speed up binary-reduce
- compile range iteration better
- [ 1 2 <cursor> ] [ f ] if dup [ A ] [ B ] if -- we need better constraints so that A has slots
- loop inversion
- growable resize computation may overflow with 3 but not 2 1+
- >fixnum dup { [ 0 >= ] [ 0 <= ] } && [ 1 + ] when -- we don't optimize this
- try to infer that buffer-capacity returns a fixnum
if we change =, we can do this,
[ dup 0.0 eq? [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
[ dup 0.0 eq? [ dup 0 = [ ] [ ] if ] [ ] if ] should
[ dup 0.0 = [ dup 0 eq? [ ] [ ] if ] [ ] if ] should not optimize
[ dup 0.0 = [ dup 0 = [ ] [ ] if ] [ ] if ] should
[ dup 0.0 = [ 3 * ] when ] should
[ dup 0.0 number= [ 3 * ] when ] should
- SPECIAL: foo -- compile versions of foo for different input classes
- [ { fixnum } declare 3 shift >fixnum ] optimized. should optimize down to fixnum-shift-fast
- 2 <sliced-groups> [ ... ] assoc-each -- first2 has a bounds check, kill it
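Several of the items above (the `[ 0 >= ] [ 0 <= ]` guard, the growable resize overflow) come down to interval inference: narrowing a value's range on the taken branch and then checking whether the following arithmetic can leave the fixnum range. The following is a rough, language-neutral model in Python, not Factor's actual propagation pass. The `Interval` class, its method names, and the 60-bit fixnum width are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed fixnum width, for illustration only; the real range is platform-dependent.
FIXNUM_BITS = 60
FIXNUM_MIN = -(1 << (FIXNUM_BITS - 1))
FIXNUM_MAX = (1 << (FIXNUM_BITS - 1)) - 1


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi] (hypothetical helper type)."""
    lo: int
    hi: int

    def refine_ge(self, n):
        """Range of x on the true branch of `x >= n`."""
        return Interval(max(self.lo, n), self.hi)

    def refine_le(self, n):
        """Range of x on the true branch of `x <= n`."""
        return Interval(self.lo, min(self.hi, n))

    def add(self, n):
        """Range of `x + n`, computed without wrapping."""
        return Interval(self.lo + n, self.hi + n)

    def fits_fixnum(self):
        """True if every value in the interval is a fixnum."""
        return FIXNUM_MIN <= self.lo and self.hi <= FIXNUM_MAX


any_fixnum = Interval(FIXNUM_MIN, FIXNUM_MAX)

# Inside the guard: x >= 0 and x <= 0, so x is exactly 0.
guarded = any_fixnum.refine_ge(0).refine_le(0)
result = guarded.add(1)

# Without the guard, x + 1 may exceed the fixnum range.
unguarded = any_fixnum.add(1)
```

In this model `result.fits_fixnum()` holds, so the addition inside the guard could use an overflow-free primitive, while `unguarded.fits_fixnum()` does not.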
done:
- fix predicate propagation
- remove-mixin-instance takes a long time
- [ [ 3 t ] [ f f ] if [ >fixnum ] when ] -- the >fixnum call should be optimized out
- make ascii decoding faster, and avoid calling >fixnum and such
- perhaps interval inference is unsound for fixnum+fast etc?
- [ [ { 1 } ] [ { 2 } ] if ] final-info -- should have a length
- peg.javascript.parser takes a long time to load
- integer-integer partial ops should use both-fixnums?
- partial dispatch ops should be flushable, and should have input-classes decls
- optimize out both-fixnums? where possible
- inlining heuristic: number of usages within this word
- we want some way to define HINTS: for literal values
- use the specialized-def when inlining?
- above two items are motivated by a desire to specialize <float-array> for small arrays, to speed up raytracer and nbody
- fixnum/i --> fixnum/i-fast if denominator is positive!
- [ <=> ] sort [ <=> ] sort [ <=> ] sort -- still doesn't scale linearly
- don't inline push: it is pointless unless we have a non-vector
- remove support for 'f' from intervals
- add identities for integer math
low level optimizer
- alien-unsigned-1: if the type of the alien is known, compile as ##call >fixnum then set-alien-unsigned-1
- review constant-propagation branch
- tag { ... } dispatch: we can eliminate a shift instruction there
- the tagged comparison optimization is not actually sound
- global constant propagation
- peephole opt for consecutive height change instructions
- DCN inserts peeks too early
- global height change optimization
- integer>scalar and vector>scalar seem tacky
- rethink interaction between phi instructions and representation selection
- integer untagging
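The peephole item for consecutive height change instructions can be pictured with a small model. This is a sketch in Python rather than Factor's compiler code. The instruction tuples and the names `inc-d`, `inc-r` and `peek` are stand-ins chosen for illustration, and only directly adjacent changes to the same stack are merged.

```python
# Ops treated as stack height changes in this model (illustrative names).
HEIGHT_OPS = {"inc-d", "inc-r"}


def merge_height_changes(instructions):
    """Fold runs of adjacent height changes on the same stack into one.

    `instructions` is a list of (op, arg) tuples. A run that sums to
    zero is dropped entirely.
    """
    out = []
    for op, arg in instructions:
        if op in HEIGHT_OPS and out and out[-1][0] == op:
            merged = out[-1][1] + arg
            out.pop()
            if merged != 0:
                out.append((op, merged))
        else:
            out.append((op, arg))
    return out


example = [
    ("inc-d", 1),
    ("inc-d", -2),
    ("peek", 0),
    ("inc-r", 1),
    ("inc-r", -1),
    ("inc-d", 3),
]
optimized = merge_height_changes(example)
```

The model deliberately does not merge across the intervening `peek`, since doing so would require rewriting the stack location that the peek reads.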
done:
- branch fusion
- VN should use identity comparison for ##load-literal instead of value equality
- combine VN with constant branch folding as a first step towards global optimization?
- 2 swap fixnum+fast
- 2 fixnum+fast 1 fixnum+fast
- DCE: set-slot on a freshly-allocated object should not mark it live
- powerpc: intrinsics for fixnum+, fixnum-, fixnum
- fixnum+ doesn't have an intrinsic anymore
- powerpc: clean up stack-frame code and test spilling
- powerpc: enable float intrinsics
- dispatch: there's a potential jump to a jump here
- linux-ppc: save/restore nonvolatile fp registers
- dispatch branch alignment off
- spilling: align stack on windows (or use unaligned load/store insns?), leave room in stack frame for integer and float spills, resolve pass
- need an optimizer pass to get rid of jumps to returns and jumps to jumps
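As an illustration of what a pass that removes jumps to jumps and jumps to returns does, here is a minimal Python model. It is not the pass Factor implements. The block representation (a dict from label to a non-empty instruction list ending in a terminator) and the function names `resolve` and `thread_jumps` are assumptions, and conditional branches are left out.

```python
def resolve(blocks, label):
    """Follow blocks that consist of a single jump until a real block is reached."""
    seen = set()
    while label not in seen:  # guard against cycles of empty jump blocks
        seen.add(label)
        body = blocks[label]
        if len(body) == 1 and body[0][0] == "jump":
            label = body[0][1]
        else:
            break
    return label


def thread_jumps(blocks):
    """Retarget each jump past jump-only blocks; turn a jump to a bare return into a return."""
    out = {}
    for label, body in blocks.items():
        *rest, last = body
        if last[0] == "jump":
            target = resolve(blocks, last[1])
            if blocks[target] == [("return",)]:
                last = ("return",)
            else:
                last = ("jump", target)
        out[label] = rest + [last]
    return out


blocks = {
    "entry": [("op", "a"), ("jump", "L1")],
    "L1": [("jump", "L2")],
    "L2": [("op", "b"), ("jump", "L3")],
    "L3": [("jump", "exit")],
    "exit": [("return",)],
}
threaded = thread_jumps(blocks)
```

After the pass, `entry` jumps straight to `L2` and `L2` ends in a return. The bypassed blocks `L1` and `L3` are left in place here; removing unreachable blocks would be a separate step.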
This revision created on Sat, 7 Mar 2020 22:13:28 by mrjbq7