Power Developer • GCC compile brainstorming

All times are UTC-06:00

GCC compile brainstorming

gunnar

Post subject: GCC compile brainstorming

PostPosted: Mon Apr 14, 2008 8:42 am

Joined: Tue Nov 02, 2004 2:11 am
Posts: 161

Do you think it makes sense to post some C code snippets here, together with the ASM instructions GCC compiles them to, as examples of how GCC operates?

The examples could be used for brainstorming and to identify patterns in the behavior of GCC.

Cheers
Gunnar

markos

Post subject: Re: GCC compile brainstorming

PostPosted: Mon Apr 14, 2008 8:55 am

Joined: Wed Oct 13, 2004 7:26 am
Posts: 348

Quote:

Hi Gunnar,

However important I think the discussion here is, it is much more important that it be held in the proper place, i.e. in the GCC bugtracker and GCC mailing lists. It is much more likely to be fixed there by the right people, and even the Freescale/CodeSourcery guys are probably following these. Of course, if you don't mind, it would be interesting to read these here as well :)

Regards

Konstantinos

Neko

Post subject: Re: GCC compile brainstorming

PostPosted: Mon Apr 14, 2008 7:39 pm

Site Admin

Joined: Fri Sep 24, 2004 1:39 am
Posts: 1594
Location: Austin, TX

Quote:

however important I think the discussion here is, it is much more important that it be held in the proper place, i.e. in the GCC bugtracker and GCC mailing lists. It is much more likely to be fixed there by the right people, and even the Freescale/CodeSourcery guys are probably following these. Of course, if you don't mind, it would be interesting to read these here as well :)

I think the discussion is very relevant here :)

However, I do think compiler performance shouldn't be Gunnar's goal here. We're talking about mimicking a 68k processor on ColdFire for a specific application. At this point a 200MB/s bus bandwidth for read is about 3x more than he would expect from a 68060 with EDO SDRAM with a 60ns access time. Certain versions of the GCC compiler generate adequate, if not high-performance, code (later versions, seemingly, do not). We know CodeWarrior and DIAB and GreenHills do better. In the end, mimicking the m68k does not rely on the compiler but on the technique used, which, as he has very competently explained in his project and elsewhere, could be one of 3 or 4 or 5 different ways (perhaps a QEMU/UAE style virtual machine, or an instruction trap mechanism as with the 68000 fpsp or 68040/68060.library mechanisms on AmigaOS, or something like ShapeShifter/Sheepshaver MacOS emulation on the Amiga, where 90% of the instructions are run native but important differences are emulated for the purpose of separation of operating systems).

This is where the important work lies. Redefining the operation of GCC4 is a waste of time. You can code the emulator now, find the best method, and fix GCC later so that it enables compilation and linking of the best method with the least amount of manual hacking. But compiler reworks are the last resort - until, that is, you hit a compiler bug that refuses to generate working code or causes exceptions or doesn't even compile, THEN it is something worth fixing :)

_________________
Matt Sealey

markos

Post subject: Re: GCC compile brainstorming

PostPosted: Mon Apr 14, 2008 11:52 pm

Joined: Wed Oct 13, 2004 7:26 am
Posts: 348

I agree that the discussion about Gunnar's project and any details around it is relevant. My point was about GCC bugs that would get too technical. It's not that they *shouldn't* be here, it's that they should be on the GCC bugtracker *too*! :)

After all, for any project most bugs are found elsewhere rather than the bugtracker/mailing lists and then filed as bug reports upstream.

In any case, don't mind me, please continue, it was an interesting read anyway :)

Konstantinos

gunnar

Post subject: Re: GCC compile brainstorming

PostPosted: Tue Apr 15, 2008 2:22 am

Joined: Tue Nov 02, 2004 2:11 am
Posts: 161

Hi Matt,

Quote:

At this point a 200MB/s bus bandwidth for read is about 3x more than he would expect from a 68060 with EDO SDRAM with a 60ns access time.

So far I have only measured 120 MB/sec read for the V4m.
You can find a comparison of 680x0 and V4m results here:
http://www.powerdeveloper.org/forums/vi ... 0621#10621

I hope this helps you.

gunnar

Post subject:

PostPosted: Tue Apr 15, 2008 3:25 am

Joined: Tue Nov 02, 2004 2:11 am
Posts: 161

One GCC example:

C-source

Code:

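#include <stddef.h>     /* size_t */

/* Copy size/16 blocks of four 32-bit words from src to dest. */
void * copy_32x4a(void *destparam, const void *srcparam, size_t size)
{
        int *dest = destparam;
        const int *src = srcparam;
        int size32;

        size32 = size / 16;     /* number of 16-byte blocks */
        for (; size32; size32--) {
                *dest++ = *src++;
                *dest++ = *src++;
                *dest++ = *src++;
                *dest++ = *src++;
        }
        /* No return statement, as in the code that was compiled. */
}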

Compile option: m68k-linux-gnu-gcc -mcpu=54455 -msoft-float -o example -Os -fomit-frame-pointer example.c
We use -Os to focus on compact code.

Generated code:

Code:

04:       202f 000c       movel %sp@(12),%d0
08:       226f 0004       moveal %sp@(4),%a1
0c:       206f 0008       moveal %sp@(8),%a0
10:       e888            lsrl #4,%d0
12:       6022            bras 36
14:       2290            movel %a0@,%a1@
16:       2368 0004 0004  movel %a0@(4),%a1@(4)
1c:       2368 0008 0008  movel %a0@(8),%a1@(8)
22:       2368 000c 000c  movel %a0@(12),%a1@(12)
28:       d3fc 0000 0010  addal #16,%a1
2e:       d1fc 0000 0010  addal #16,%a0
34:       5380            subql #1,%d0
36:       4a80            tstl %d0
38:       66da            bnes 14
3a:       4e75            rts

Code length produced by GCC = 56 bytes
Length of workloop = 9 instructions, 38 bytes

Expected code:

Code:

04:       202f 000c       movel %sp@(12),%d0
08:       226f 0004       moveal %sp@(4),%a1
0c:       206f 0008       moveal %sp@(8),%a0
10:       e888            lsrl #4,%d0
12:       670c            beqs 20
14:       22d8            movel %a0@+,%a1@+
16:       22d8            movel %a0@+,%a1@+
18:       22d8            movel %a0@+,%a1@+
1a:       22d8            movel %a0@+,%a1@+
1c:       5380            subql #1,%d0
1e:       66f4            bnes 14
20:       4e75            rts

Expected code length = 30 bytes
Length of workloop = 6 instructions, 12 bytes
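The two workloop sizes can be checked by adding up the encoded length of each instruction. A small sketch in C, with the lengths read off the opcode columns of the two listings above (the names loop_bytes, gcc_loop and expected_loop are made up for this illustration):

```c
#include <stddef.h>

/* Sum the encoded byte lengths of a sequence of instructions. */
static unsigned loop_bytes(const unsigned char *lengths, size_t count)
{
    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += lengths[i];
    return total;
}

/* Workloop as generated by GCC, one entry per instruction. */
static const unsigned char gcc_loop[] = {
    2,  /* movel %a0@,%a1@          */
    6,  /* movel %a0@(4),%a1@(4)    */
    6,  /* movel %a0@(8),%a1@(8)    */
    6,  /* movel %a0@(12),%a1@(12)  */
    6,  /* addal #16,%a1            */
    6,  /* addal #16,%a0            */
    2,  /* subql #1,%d0             */
    2,  /* tstl %d0                 */
    2   /* bnes                     */
};

/* Expected workloop using post-increment addressing. */
static const unsigned char expected_loop[] = {
    2, 2, 2, 2,  /* four post-increment moves */
    2,           /* subql #1,%d0              */
    2            /* bnes                      */
};
```

Calling loop_bytes(gcc_loop, sizeof gcc_loop) and loop_bytes(expected_loop, sizeof expected_loop) reproduces the 38 and 12 bytes quoted above, so the expected loop is less than a third of the size.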

Issue 1:
Why does GCC not use the condition codes already set by the preceding subq.l, and instead generate an unneeded tst.l?

Issue 2:
Why does GCC not use the much more efficient (An)+ addressing mode, and instead use d(An) mode plus an extra add instruction?
A move.l (An)+,(Am)+ is 2 bytes instead of 6 bytes.
And (An)+,(Am)+ does not need the two extra instructions to increment the pointers.

Issue 3:
Assume that GCC decided to increment a pointer manually.
Why does GCC use adda.l (addal in the listing) to increment a pointer?
LEA should be the better choice for this, as lea 16(An),An is 2 bytes shorter than adda.l with a 32-bit immediate.
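One thing that might be worth trying at the source level is to write the loop so that the decrement and the exit test are a single expression, with the zero case checked once up front. This is only a sketch: the function name copy_32x4_postinc is made up, it assumes a 4-byte int as on m68k/ColdFire, and whether a given GCC version then emits (An)+ moves or drops the tst.l is not guaranteed and has to be checked in the generated assembly.

```c
#include <stddef.h>

/* Hypothetical variant of copy_32x4a: same copy, restructured loop. */
void *copy_32x4_postinc(void *destparam, const void *srcparam, size_t size)
{
    int *dest = destparam;
    const int *src = srcparam;
    size_t blocks = size / 16;      /* 16 bytes = four 4-byte ints */

    if (blocks == 0)                /* zero case tested once, before the loop */
        return destparam;

    do {
        *dest++ = *src++;           /* candidates for move.l (An)+,(Am)+ */
        *dest++ = *src++;
        *dest++ = *src++;
        *dest++ = *src++;
    } while (--blocks != 0);        /* decrement and test in one expression */

    return destparam;
}
```

Unlike the original, this version returns the destination pointer, so its prologue and epilogue will not match the listings above exactly.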

Last edited by gunnar on Tue Apr 15, 2008 4:06 am, edited 1 time in total.

gunnar

Post subject:

PostPosted: Tue Apr 15, 2008 4:04 am

Joined: Tue Nov 02, 2004 2:11 am
Posts: 161

Second example:

C-Code:

Code:

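#include <stddef.h>     /* size_t */

/* Fill size/16 blocks of four 32-bit words with the constant 1. */
void * write_32x4(void *destparam, const void *srcparam, size_t size)
{
        int  value=1;
        int *dst = destparam;

        size = size / 16;       /* number of 16-byte blocks */
        for (; size; size--) {
             *dst++=value;
             *dst++=value;
             *dst++=value;
             *dst++=value;
        }
        /* srcparam is unused and nothing is returned, as in the compiled code. */
}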

Generated output

Code:

<write_32x4>:

0a:       202f 000c       movel %sp@(12),%d0
0e:       206f 0004       moveal %sp@(4),%a0
12:       e888            lsrl #4,%d0
14:       601c            bras 32
16:       20bc 0000 0001  movel #1,%a0@
1c:       7201            moveq #1,%d1
1e:       2141 0004       movel %d1,%a0@(4)
22:       2141 0008       movel %d1,%a0@(8)
26:       2141 000c       movel %d1,%a0@(12)
2a:       d1fc 0000 0010  addal #16,%a0
30:       5380            subql #1,%d0
32:       4a80            tstl %d0
34:       66e0            bnes 16
36:       4e75            rts

Generated code length = 46 bytes
Length of workloop: 9 instructions, 32 bytes

The expected result would be

Code:

<write_32x4>:

0a:       202f 000c       movel %sp@(12),%d0
0e:       206f 0004       moveal %sp@(4),%a0
12:       7201            moveq #1,%d1
14:       e888            lsrl #4,%d0
16:       670c            beqs 24
18:       20c1            movel %d1,%a0@+
1a:       20c1            movel %d1,%a0@+
1c:       20c1            movel %d1,%a0@+
1e:       20c1            movel %d1,%a0@+
20:       5380            subql #1,%d0
22:       66f4            bnes 18
24:       4e75            rts

Expected code length = 28 bytes
Length of workloop: 6 instructions, 12 bytes

Issue 4:
We again see the unneeded TST instruction.

Issue 5:
The compiler again uses a much bigger and slower addressing mode.

Issue 6:
The preload of the work value into register D1 is done inside the workloop. This should be done outside of the main workloop.

Issue 7:
The compiler decides to put the literal work value #1 into the work register D1. But it does not always use this work register: one time it uses a literal move.l #1, thereby needlessly increasing the code size by 4 bytes.

Both GCC 4 examples have 9 instructions inside the workloop. Older GCC would solve the same task using a workloop of only 6 instructions.
Generally the new GCC 4 code is bigger and a lot slower than before.
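For comparison with the expected listing, here is a source-level sketch of the fill loop with the constant held in a const local before the loop and the decrement folded into the loop condition. The name write_32x4_hoisted is made up, a 4-byte int is assumed, and this only expresses the intent: whether GCC keeps the value in D1 outside the loop and uses (An)+ stores still has to be verified in the assembly output.

```c
#include <stddef.h>

/* Hypothetical variant of write_32x4: constant set up before the loop. */
void *write_32x4_hoisted(void *destparam, const void *srcparam, size_t size)
{
    const int value = 1;            /* work value, defined once outside the loop */
    int *dst = destparam;
    size_t blocks = size / 16;      /* 16 bytes = four 4-byte ints */

    (void)srcparam;                 /* unused, kept to match the original signature */

    if (blocks == 0)
        return destparam;

    do {
        *dst++ = value;             /* candidates for move.l %d1,(An)+ */
        *dst++ = value;
        *dst++ = value;
        *dst++ = value;
    } while (--blocks != 0);

    return destparam;
}
```

If the compiler output for this variant still shows the literal move.l #1 or the extra tst.l, that points at the code generator rather than at the way the C source is written.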

gunnar

Post subject:

PostPosted: Tue May 06, 2008 1:25 am

Joined: Tue Nov 02, 2004 2:11 am
Posts: 161

To help improve the code generated by GCC for ColdFire/68k, I've filed bugs 36133, 36134, 36135, and 36136 against GCC.

kaltst

Post subject: gcc 3.4?

PostPosted: Tue May 06, 2008 2:24 am

Joined: Tue Nov 02, 2004 6:17 am
Posts: 28

Hi Gunnar,

do you know by chance whether the gcc 3.4 code suffers from the same problems?
