All times are UTC-06:00




Posted: Mon Jul 19, 2010 7:37 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
I know it may not be the best time for such projects on the platform, but since the project's initial phase is done, i can just as well bring it to the attention of my fellow powerdevelopers:

ntsh-jass

It's a pet project of mine that has been developed on and off over the span of a few years now, across several platforms. The project started off as a testbed for scene graph experiments, and was largely neglected after it served its initial purpose, until earlier this year, when i needed a lightweight shader prototyping and demonstration framework at my job. At that point i added a simple 'material system' and branched a GLSL-targeting base, and, well, did not stop there, but subsequently moved on to ES2. As you can guess, that last part occurred on the EfikaMX (yes, i'm that guy who keeps babbling about using low-power computers as graphics development stations).

Long story short, ntsh-jass is a low-level abstraction (a really thin layer) over the GL API, in all its shader incarnations: ARB_program and GLSL, both desktop and embedded. It consists of a scene graph, a shader-based material system, and a couple of mesh format readers. It's useful as a testbed for shader prototyping and evaluation, or as a backend to higher-level parts of a game engine pipeline (i guess i could share with you that its original scene graph was used in a proprietary commercial engine, with shipped products). It's been run on quite a few platforms, some of which are not supported anymore, but the ES2 branch runs just fine on the current EfikaMX with the 2010.04.05 kernel and userspace GL components. As a matter of fact, i did not even have to dumb down the few shaders from the desktop versions (not that they're complex, but still).

Project's structure is a bit messy, but the codebase is quite readable (as the project served an academic purpose at one stage too). So I guess that's about it. If anybody finds it useful for something of their own - just shout for help. I might use it for a commercial project of my own, but that is still in early design stages.

Cheers.

ps: a couple of screenshots from the testbed mesh viewer app run on EfikaMX:
[Image] [Image]
(model found on the internet, by an unknown author; let me know if you know whom i should credit)

pps: for convenience, here's the tiny stand-alone shader performance test from the other thread.


Last edited by blu on Sat Aug 28, 2010 8:21 am, edited 2 times in total.

Posted: Thu Jul 22, 2010 8:24 pm
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
Very nice. We will endeavour to release a new GL library once we've worked through a few issues, and make sure we keep your code working :)

What is the performance like in your opinion? You said you didn't have to dumb it down, does this mean you got better than you expected?

_________________
Matt Sealey


Posted: Fri Jul 23, 2010 2:06 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
Quote:
Very nice. We will endeavour to release a new GL library once we've worked through a few issues, and make sure we keep your code working :)
Thanks, Matt. I would truly appreciate a new GL drop. And generally, thank you, Genesi, for the commitment to keep efikaMX hospitable to guys like me. Re the viability of ntsh-jass, it really does not take much, as code has the ability to 'auto-repair' itself when I'm near a keyboard, anyway ; )
Quote:
What is the performance like in your opinion? You said you didn't have to dumb it down, does this mean you got better than you expected?
Well, it performed very much in line with what I expected, given my earlier experiments on the platform. I did not have to dumb down any shaders in terms of shader programming model - the z430 has a 'grown-up'-enough model that it can take low-to-mid-complexity desktop shaders unmodified, and execute them diligently (i.e. at the original precision, etc).

About speed bottlenecks: those do not currently appear to be on the GPU side. The CPU was constantly using a good portion of the frame time. CPU load never dropped much below 30% for the heaviest of GPU workloads (the same would normally sit at around 2% CPU on a well-greased GL pipeline, as the app does next to nothing on the CPU per frame), and would rise as high as 70% for the lightest of GPU workloads, while not hitting a particularly high FPS (in comparison, the figure would be ~15 to 20% on a streamlined GL pipeline). These observations speak of the CPU's 'over-involvement' in the frame cycle somewhere down the pipeline. But I don't think I'm telling you anything new here.

Re shader complexity: the most complex shader I've thrown at the z430 so far was the one from the other thread with the little benchmark app, and efikaMX did well for such a complex shader (complex as in 'for a benchmark, not for practical applications'); despite the software blitting hampering the z430, only one other embedded platform I've tested has been able to top that so far (and it wasn't the tegra250 ;). In ntsh-jass, though, I used more practical shaders (but still desktop-grade meshes), doing 32-bone skinning, with and without morphing, and also multi-light per-pixel illumination with up to 3 phong lights. They all showed enough potential that, if the rest of the pipeline were not dragging the framerate down, those shaders would've been usable in practical scenarios on the efikaMX (at suitable SD resolutions, naturally). Generally, if a real-life shader setup, run at SD resolutions, does low-to-mid-teens FPS while consuming 30% or more of CPU, then there's reason to expect it to hit the 20s at 'good' CPU behaviour, before any developer effort to actually optimise for the platform. That's already talking business to me.
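
For what it's worth, that last estimate can be sanity-checked with a back-of-the-envelope calculation. The sketch below is an illustration, not a measurement: it assumes the reported CPU share is time spent serially inside each frame, so that trimming it shortens the frame by the same fraction. The helper name estimate_fps and the sample inputs (15 FPS, 30% CPU now, 2% on a well-behaved pipeline) are picked for the example.

```shell
#!/bin/sh
# Rough model (an assumption, not a measurement): CPU and GPU work are
# serialised within a frame, so freeing CPU time shortens the frame.
# Args: measured FPS, measured CPU share (%), target CPU share (%)
estimate_fps() {
    awk -v fps="$1" -v cpu="$2" -v target="$3" 'BEGIN {
        frame_ms = 1000 / fps                        # measured frame time
        saved_ms = frame_ms * (cpu - target) / 100   # CPU time handed back
        printf "%.1f\n", 1000 / (frame_ms - saved_ms)
    }'
}

estimate_fps 15 30 2    # prints 20.8
```

Since real pipelines overlap CPU and GPU work to some degree, this is only a rough guide to the order of magnitude.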


Posted: Sun Jul 25, 2010 9:27 am
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
Quote:
These observations speak of the CPU's 'over-involvement' in the frame cycle somewhere down the pipeline. But I don't think I'm telling you anything new here.
Have you tried running it through gprof? The GL libs are shipped with debugging symbols so you should be able to see exactly where the app is spending most of its time in the GL libs.

It'd be nice to know what you think the best place to start would be.

_________________
Matt Sealey


Posted: Sun Jul 25, 2010 10:44 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
first profile session ended up ingloriously - gprof ran out of memory after ~45min of crunching (it tried to allocate a gigabyte, to which, naturally, efikaMX minded). apparently that -f <function_that_draws_a_frame> narrowing of the call tree i had specified was not narrow enough for effective profiling. will return to it one of the following evenings. in the meantime, if somebody knows how to be extremely efficient with gprof, please, do share your lore here.


Posted: Mon Jul 26, 2010 4:54 am
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
With the TO2 kernel I did put ramzswap in there and enabled it, so you should be able to use a compressed swap drive in RAM, plus a swapfile as backing store (which is also compressed), to give you a little more memory. To set it up properly you'll need the compcache 0.6.2 utils from

http://compcache.googlecode.com/

but actually you could put something like this in /etc/rc.local to get you started (put it BEFORE 'exit 0' :):
Code:
modprobe ramzswap num_devices=3 disksize_kb=262144
udevadm settle
swapon /dev/ramzswap0
It should alleviate a little of the memory issues, and actually makes your system run faster :]

_________________
Matt Sealey


Posted: Tue Jul 27, 2010 10:55 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
Matt, ramzswap did not resolve the issue. Perhaps I'm erroneously interpreting the reported condition?
Code:
$ LD_LIBRARY_PATH=/usr/local/lib/ DISPLAY=:0.0 gprof -f eglSwapBuffers ./testbed assets/cube.mesh -no_mini_view -frames 10

gprof: out of memory allocating 1080714280 bytes after a total of 3198976 bytes
BTW, i used the ramzswap.ko from the 2010.04.05 modules tar. I also have a freshly built 0.6.2, but as the prebuilt one seems to work ok, i just used that.
Code:
$ cat /proc/swaps
Filename Type Size Used Priority
/dev/ramzswap0 partition 262136 0 -1


Posted: Sat Aug 07, 2010 12:12 am
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
You're going to have to give it a couple gigabytes of disk swap space too, unfortunately.

The ramzswap 0.6.2 will come with a tool that will let you compress the swapfile on disk - you just have to load the module with num_devices=3 or something, and then basically make one in ram, one on disk... and you can use the spare one to add another one on another disk if need be. The docs are all in the rzscontrol manpage.

The reason to use ramzswap: yes it will slow the system down a teeny bit due to compression and decompression of pages but the reduction in latency to read from ram instead of disk, and then to read less from disk and write less TO disk, really makes up for it. What you lose in CPU time you make up for in less IO wait time, so in theory it should be using more of the CPU, and you don't have to have a 2GB swap file to get 2GB of swap space :)
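
A minimal sketch of the disk-backed part, for anyone following along: the usual dd / mkswap / swapon sequence, wrapped in a helper with a dry-run switch so the commands can be previewed first. The path /mnt/usb/swapfile, the helper names run and make_swapfile, and the DRY_RUN variable are placeholders invented for this example. Wiring the file up as a compressed ramzswap backing store is a separate step, covered by the rzscontrol manpage mentioned above.

```shell
#!/bin/sh
# Sketch: create and enable a disk-backed swap file (needs root for real use).
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

make_swapfile() {
    path=$1       # placeholder location, e.g. a file on a USB hard drive
    size_mb=$2    # e.g. 2048 for a couple of gigabytes
    run dd if=/dev/zero of="$path" bs=1M count="$size_mb"   # allocate the file
    run chmod 600 "$path"                                   # swap must not be world-readable
    run mkswap "$path"                                      # no -c (bad-block check) here
    run swapon "$path"                                      # enable it
}

# Preview the commands without touching the disk:
DRY_RUN=1 make_swapfile /mnt/usb/swapfile 2048
```

Afterwards, cat /proc/swaps should list the new file next to /dev/ramzswap0.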

_________________
Matt Sealey


Posted: Sat Aug 07, 2010 9:31 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
thanks, Matt! that's all new stuff to me (in case it was not apparent : ) i'll resume the profiling attempts shortly.


Posted: Wed Aug 11, 2010 8:31 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
i just finished my first successful gprof session. even though it was both educational and self-embarrassing (i don't recall the last time i did so many things wrong in setting up a simple task), it did not bring the much desired enlightenment to this rural corner of the universe. after reading up a bit on what the heck i was doing, i'm practically confident now that gprof requires proper instrumentation of everything that is intended to be seen in the profile, i.e. the presence of debug information alone is insufficient for the profiling of a routine. as a result, nothing outside of my instrumented application code is currently visible in the profile charts. on the plus side, it pinpointed one place in my code as a candidate for optimisation on the cortex a8.

hereupon, i plead for a -pg build (i.e. gprof-instrumented) of the es/egl userspace binaries. if possible, Matt, when you have the time.

ps: a piece of advice unrelated to the topic: never, ever, for the love of your cat, use the -c option when doing mkswap on a flash partition. ever.
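
To illustrate the point about instrumentation, here is a small self-contained sketch of the usual gprof workflow; it is not taken from the ntsh-jass build. The file names hot.c and hot, the function burn and the helper gprof_demo are made up for the example, and it assumes gcc and gprof are installed. Note that gprof does not launch the program itself: the instrumented binary is run normally first and writes gmon.out on exit, and gprof is then given the executable followed by that data file.

```shell
#!/bin/sh
# Sketch: gprof only breaks out code that was compiled and linked with -pg.
# Runs in a subshell so that 'set -e' and 'cd' do not leak into the caller.
gprof_demo() (
    set -e
    cd "${1:-.}"                 # scratch directory (defaults to the cwd)

    # Tiny C program with one deliberately slow function.
    cat > hot.c <<'EOF'
#include <stdio.h>

static double burn(long n)
{
    double acc = 0.0;
    long i;
    for (i = 1; i <= n; ++i)
        acc += 1.0 / (double)i;
    return acc;
}

int main(void)
{
    printf("%f\n", burn(50000000L));
    return 0;
}
EOF

    gcc -pg -O0 -o hot hot.c     # -pg adds the profiling hooks at compile and link time
    ./hot > /dev/null            # ordinary run; gmon.out is written on normal exit
    gprof -b ./hot gmon.out      # executable first, then the profile data file
)

# Usage (any writable scratch directory will do):
#   gprof_demo /tmp
```

Libraries that were built without -pg do not get per-function call counts this way, which is consistent with the observation above.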


Posted: Thu Aug 12, 2010 6:56 am
Joined: Mon Jan 08, 2007 3:40 am
Posts: 195
Location: Pinto, Madrid, Spain
Quote:
i just finished my first successful gprof session.
Lovely read, Martin, even if I'm unable to understand almost anything in that Linux jungle some of you love to live in...
Quote:
never, ever, for the love of your cat, use the -c option when doing mkswap on a flash partition. ever.
Swapping on a flash device? Is that ever healthy? Is swapping in itself healthy anyway?

;-)


Posted: Thu Aug 12, 2010 10:14 am
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
The bigger the flash device the more healthy it is (lots of extra room for marking used cells and finding a new one :)

Swapping should be as safe as doing a lot of compiling. Ideally though, go buy a USB hard drive if you are worried about it.

_________________
Matt Sealey


Posted: Thu Aug 12, 2010 12:22 pm
Site Admin
Joined: Fri Sep 24, 2004 1:39 am
Posts: 1589
Location: Austin, TX
Quote:
i just finished my first successful gprof session. even though it was both educational and self-embarrassing (i don't recall the last time i did so many things wrong in setting up a simple task), it did not bring the much desired enlightenment to this rural corner of the universe. after reading up a bit on what the heck i was doing, i'm practically confident now that gprof requires proper instrumentation of everything that is intended to be seen in the profile, i.e. the presence of debug information alone is insufficient for the profiling of a routine. as a result, nothing outside of my instrumented application code is currently visible in the profile charts. on the plus side, it pinpointed one place in my code as a candidate for optimisation on the cortex a8.

hereupon, i plead for a -pg build (i.e. gprof-instrumented) of the es/egl userspace binaries. if possible, Matt, when you have the time.
We'll think about it for sure.. but first, a standard release needs to come out :D

_________________
Matt Sealey


Posted: Tue Aug 17, 2010 7:53 pm
Joined: Tue Mar 31, 2009 10:24 pm
Posts: 171
Quote:
We'll think about it for sure.. but first, a standard release needs to come out :D
thanks, i appreciate that.
Quote:
Swapping on a flash device? Is that ever healthy? Is swapping in itself healthy anyway?
i use only sterilized flash cards ; )


Posted: Tue Aug 24, 2010 9:46 pm
Joined: Thu Nov 18, 2004 11:48 am
Posts: 110
oprofile seems to be working correctly on the efika. You might try using it.
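
For reference, a sketch of what a session with the classic opcontrol / opreport tools might look like. Unlike gprof, oprofile samples the whole system, so it does not need the binaries to be built with -pg. The helper names run and oprofile_session and the DRY_RUN switch are invented for this example, the testbed command line is borrowed from earlier in the thread, and the commands need root when run for real.

```shell
#!/bin/sh
# Sketch of a system-wide oprofile session around one workload.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

oprofile_session() {
    run opcontrol --init          # load the oprofile kernel driver
    run opcontrol --no-vmlinux    # no kernel image: skip kernel symbol resolution
    run opcontrol --reset         # drop samples from earlier sessions
    run opcontrol --start         # begin sampling
    run "$@"                      # the workload to be profiled
    run opcontrol --shutdown      # stop sampling and flush the samples
    run opreport --symbols        # per-symbol report across all sampled images
}

# Preview the session:
DRY_RUN=1 oprofile_session ./testbed assets/cube.mesh -no_mini_view -frames 10
```

How much detail shows up for the GL libraries depends on the symbols they ship with.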


PowerDeveloper.org: Copyright © 2004-2012, Genesi USA, Inc. The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
All other names and trademarks used are property of their respective owners. Privacy Policy