Discussion:
double cpu on 3.7a0 vs 3.6.6 - fft?
Fredrik Olofsson
2014-10-09 17:49:20 UTC
Permalink
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version produces quite different results.

it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357

first i though i did something silly building sc37a0 myself, but now with Scott's incremental builds (wow, many thanks) i see the same issue.

(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)

any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f


#|
fredrikolofsson.com musicalfieldsforever.com
|#
Dan Stowell
2014-10-09 19:10:04 UTC
Permalink
Hi f,

Sorry I am out of touch so I can't answer about the differences. All I
offer is some ramblings of an old man:

You might be able to use "Shark" (Mac dev tool... does it still
exist?) to inspect the CPU usage of the process and that could help
narrow it down. You normally use it with a debug build and it tells
you exactly which code has the hotspots, but in this situation it
might be helpful to run it on the non-debug builds which would tell
you in less detail but it would tell you which DLLs are running hot
(or the scsynth process). Then I think I would use "otool -L" to check
if the different builds were linking against different libs (e.g. just
to check that they're all actually using vDSP for their fft). Then if
it is indeed FFT, one would look over the log of changes since sc366.
Hmm, actually, can you run 3.7 plugins in 3.6 and vice versa? I don't
recall...

Dan
Post by Fredrik Olofsson
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version
produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now with
Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
--
http://www.mcld.co.uk

_______________________________________________
sc-dev mailing list

info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/
Scott Wilson
2014-10-09 19:28:02 UTC
Permalink
Hey Fredrik,

I’m probably stating the obvious here, but I would guess that the incremental builds are debug config?

S.
Post by Fredrik Olofsson
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now with Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
Dan Stowell
2014-10-09 19:33:42 UTC
Permalink
Scott -

You may be onto something! The .travis.yml says
"-DCMAKE_BUILD_TYPE=Release" but only in the linux line, not in the
mac line. You might just have earnt yourself a sticker...

Dan

_______________________________________________
sc-dev mailing list

info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/
Fredrik Olofsson
2014-10-09 19:39:42 UTC
Permalink
good thinking, but no, i don't believe that's the problem. my own 37a0 build wasn't debug and i only used the incremental one to verify that it had the same speed issue.
also other scsynth benchmark tests i did (without fft) showed no difference.
_f
Post by Scott Wilson
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the incremental builds are debug config?
S.
#|
fredrikolofsson.com musicalfieldsforever.com
|#


_______________________________________________
sc-dev mailing list

info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/
Scott Wilson
2014-10-09 20:36:05 UTC
Permalink
If you’re sure about that I’d suggest:

1) Call dumpUGens on the SynthDef to make sure it’s generating the same graph in both cases.
2) You could start comparing the various files involved to see if there’s an obvious difference. There’re are a number of places where this could have happened. SC_fftlib.cpp would be a good place, but could also be a change in cmake causing it to compile with a different library, etc.

S.
Post by Fredrik Olofsson
good thinking, but no, i don't believe that's the problem. my own 37a0 build wasn't debug and i only used the incremental one to verify that it had the same speed issue.
also other scsynth benchmark tests i did (without fft) showed no difference.
_f
Post by Scott Wilson
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the incremental builds are debug config?
S.
#|
fredrikolofsson.com musicalfieldsforever.com
|#
_______________________________________________
sc-dev mailing list
info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/
_______________________________________________
sc-dev mailing list

info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/
Scott Carver
2014-10-09 21:39:59 UTC
Permalink
Hmmm, the results here are weird. Here's what turns up in profiling, diffed
between 3.6.6 and 3.7a:

Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k


(Btw, you can do this in Instruments, w/ the time profiler. Do a run on
both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?

Try this:

(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)

And I see:

// 3.7:
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1024*
* BufFrames: 1023*
* PV_Copy: 1023*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1


The two outliers - BufFrames and PV_Copy - are worth looking at.... looks
like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which
was introduced in this change
<https://github.com/supercollider/supercollider/commit/15ff964d669c41addfd4affb41eca083b3144c9b>,
so that seems like an obvious culprit. My guess is the addCopiesIfNeeded
logic isn't catching a case where a copy isn't actually needed? This looks
like a bug for Scott W since he was the author.


- S
Post by Scott Wilson
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the
incremental builds are debug config?
S.
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version
produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now with
Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar <http://sinosc.ar/>);
Array.fill(buffersize2, {|i|
var dem= Demand.kr <http://demand.kr/>(chain>=0, 0, Unpack1FFT(chain,
buffersize, i));
Out.kr <http://out.kr/>(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
Scott Carver
2014-10-09 21:45:41 UTC
Permalink
Mmmmm, I have a slow-burning project to add unit / pinning tests for UGens
and SynthDef graphs to flag this sort of thing when it's introduced.
Hopefully I can stand something up sooner rather than later - there are a
handful of bad bugs I would love to fix, and some features as well, that
are simply to scary to make w/o some way to validate that the correct
rendering is still happening....

- Scott Carver
Post by Scott Carver
Hmmm, the results here are weird. Here's what turns up in profiling,
Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k
(Btw, you can do this in Instruments, w/ the time profiler. Do a run on
both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1024*
* BufFrames: 1023*
* PV_Copy: 1023*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
The two outliers - BufFrames and PV_Copy - are worth looking at.... looks
like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which
was introduced in this change
<https://github.com/supercollider/supercollider/commit/15ff964d669c41addfd4affb41eca083b3144c9b>,
so that seems like an obvious culprit. My guess is the addCopiesIfNeeded
logic isn't catching a case where a copy isn't actually needed? This looks
like a bug for Scott W since he was the author.
- S
Post by Scott Wilson
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the
incremental builds are debug config?
S.
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version
produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now with
Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar <http://sinosc.ar/>);
Array.fill(buffersize2, {|i|
var dem= Demand.kr <http://demand.kr/>(chain>=0, 0, Unpack1FFT(chain,
buffersize, i));
Out.kr <http://out.kr/>(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
Scott Wilson
2014-10-10 06:37:22 UTC
Permalink
That would be great! Although, having had some experience trying to sort out the various problem in SC with ‘over-‘optimisation of the UGen graph, I have to say that part of the problem is designing exhaustive test cases.

S.
Mmmmm, I have a slow-burning project to add unit / pinning tests for UGens and SynthDef graphs to flag this sort of thing when it's introduced. Hopefully I can stand something up sooner rather than later - there are a handful of bad bugs I would love to fix, and some features as well, that are simply to scary to make w/o some way to validate that the correct rendering is still happening....
- Scott Carver
Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k
(Btw, you can do this in Instruments, w/ the time profiler. Do a run on both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1024
BufFrames: 1023
PV_Copy: 1023
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
The two outliers - BufFrames and PV_Copy - are worth looking at.... looks like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which was introduced in this change, so that seems like an obvious culprit. My guess is the addCopiesIfNeeded logic isn't catching a case where a copy isn't actually needed? This looks like a bug for Scott W since he was the author.
- S
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the incremental builds are debug config?
S.
Post by Fredrik Olofsson
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now with Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
Scott Carver
2014-10-11 18:39:25 UTC
Permalink
I wonder if it would be enough to be able to hash the ugen graph, and then
compare from build to build for a bunch of canonical test synthdefs. If we
were careful about building the test synthdefs so that a failure would be
informative (i.e. they do one or two specific things), this could probably
help to pin the expectations in place at least.

Of course, there are legit changes in cases of optimization to - but,
changing the ugen graph is a big big deal, so anyone making a change like
that should be expecting a slog through to verify that synthdef differences
were as expected.

- S
Post by Scott Wilson
That would be great! Although, having had some experience trying to sort
out the various problem in SC with ‘over-‘optimisation of the UGen graph, I
have to say that part of the problem is designing exhaustive test cases.
S.
Mmmmm, I have a slow-burning project to add unit / pinning tests for UGens
and SynthDef graphs to flag this sort of thing when it's introduced.
Hopefully I can stand something up sooner rather than later - there are a
handful of bad bugs I would love to fix, and some features as well, that
are simply to scary to make w/o some way to validate that the correct
rendering is still happening....
- Scott Carver
Post by Scott Carver
Hmmm, the results here are weird. Here's what turns up in profiling,
Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k
(Btw, you can do this in Instruments, w/ the time profiler. Do a run on
both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1024*
* BufFrames: 1023*
* PV_Copy: 1023*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
* LocalBuf: 1*
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
The two outliers - BufFrames and PV_Copy - are worth looking at.... looks
like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which
was introduced in this change
<https://github.com/supercollider/supercollider/commit/15ff964d669c41addfd4affb41eca083b3144c9b>,
so that seems like an obvious culprit. My guess is the addCopiesIfNeeded
logic isn't catching a case where a copy isn't actually needed? This looks
like a bug for Scott W since he was the author.
- S
Post by Scott Wilson
Hey Fredrik,
I’m probably stating the obvious here, but I would guess that the
incremental builds are debug config?
S.
dear developers,
the code below is a stripped down version of a bigger piece of code.
on the same machine (macbook pro, 2.2ghz, 10.9.5), the new 3.7a0 version
produces quite different results.
it takes ~30% on sc37a0
and only ~13% on sc366 and on sc357
first i though i did something silly building sc37a0 myself, but now
with Scott's incremental builds (wow, many thanks) i see the same issue.
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus;
s.bootSync;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar <http://sinosc.ar/>);
Array.fill(buffersize2, {|i|
var dem= Demand.kr <http://demand.kr/>(chain>=0, 0, Unpack1FFT(chain,
buffersize, i));
Out.kr <http://out.kr/>(bus+i, dem);
});
}).load;
s.sync;
"starting".postln;
Synth(\avTrk, [\bus, bus]);
};
)
any ideas? is this expected behavior?
i ruled out Unpack1FFT but now suspect FFT.
_f
#|
fredrikolofsson.com musicalfieldsforever.com
|#
Scott Wilson
2014-10-10 06:34:37 UTC
Permalink
Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k
(Btw, you can do this in Instruments, w/ the time profiler. Do a run on both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1024
BufFrames: 1023
PV_Copy: 1023
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
The two outliers - BufFrames and PV_Copy - are worth looking at.... looks like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which was introduced in this change, so that seems like an obvious culprit. My guess is the addCopiesIfNeeded logic isn't catching a case where a copy isn't actually needed? This looks like a bug for Scott W since he was the author.
Yes, I had a suspicion, which is why I suggested checking the graph. At a glance it’s not an obvious fix though, as it should be excluding Unpack1FFT from this already. I’m a bit busy now, so Fredrik, can you file a bug report and I’ll check it when I have time?

S.
Fredrik Olofsson
2014-10-12 18:11:47 UTC
Permalink
Running Time Self Symbol Name
+492.0ms +73.1% 492.0 PV_Copy_next
+190.0ms +28.2% 190.0 Demand_next_aa
+100.0ms +14.8% 100.0 Graph_Calc
+29.0ms +4.3% 29.0 add_1
+16.0ms +2.3% 16.0 Out_next_k
(Btw, you can do this in Instruments, w/ the time profiler. Do a run on both apps, and then Instrument -> Compare Call Trees)
Okay, PV_Copy is the cause.... ?
(
var buffersize= 2048;
var buffersize2= buffersize.div(2);
Routine.run{
var bus, ugens, def;
bus= Bus.control(s, buffersize2);
CmdPeriod.doOnce({bus.free});
{
def = SynthDef(\avTrk, {|bus|
var chain= FFT(LocalBuf(buffersize), SinOsc.ar);
Array.fill(buffersize2, {|i|
var dem= Demand.kr(chain>=0, 0, Unpack1FFT(chain, buffersize, i));
Out.kr(bus+i, dem);
});
Line.kr(0, 1, 10, doneAction:2);
});
"child ugens: %".format(def.children.size).postln;
ugens = ();
def.children.do {
|child|
var count = ugens.atFail(child.class, 0);
count = count + 1;
ugens[child.class] = count;
};
ugens.keysValuesDo {
|class, count|
"\t%: %".format(class.name, count).postln;
};
}.defer;
s.sync;
n = Synth(\avTrk, [\bus, ~bus]);
};
)
child ugens: 8194
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1024
BufFrames: 1023
PV_Copy: 1023
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
// 3.6.6
child ugens: 5125
BinaryOpUGen: 2047
Demand: 1024
Unpack1FFT: 1024
Out: 1024
LocalBuf: 1
Control: 1
MaxLocalBufs: 1
FFT: 1
SinOsc: 1
Line: 1
The two outliers - BufFrames and PV_Copy - are worth looking at.... looks like BufFrames is getting created by PV_ChainUGen:addCopiesIfNeeded, which was introduced in this change, so that seems like an obvious culprit. My guess is the addCopiesIfNeeded logic isn't catching a case where a copy isn't actually needed? This looks like a bug for Scott W since he was the author.
Yes, I had a suspicion, which is why I suggested checking the graph. At a glance it’s not an obvious fix though, as it should be excluding Unpack1FFT from this already. I’m a bit busy now, so Fredrik, can you file a bug report and I’ll check it when I have time?
ok, done. https://github.com/supercollider/supercollider/issues/1202
and thanks scott and scott for investigating and giving lots of useful debugging tips. appreciated.
_f

#|
fredrikolofsson.com musicalfieldsforever.com
|#


_______________________________________________
sc-dev mailing list

info (subscription, etc.): http://www.beast.bham.ac.uk/research/sc_mailing_lists.shtml
archive: http://www.listarc.bham.ac.uk/marchives/sc-dev/
search: http://www.listarc.bham.ac.uk/lists/sc-dev/search/

Loading...