struct_implementation

Differences

This shows you the differences between two versions of the page.

struct_implementation [2010/05/12 14:02]
zhangvi1
struct_implementation [2010/12/15 15:53]
Line 1: Line 1:
-=======Goal======= 
- 
-Store struct member data in hardware RAM modules.
- 
-For example, in dhrystone with the union removed:
-%struct.record = type { %struct.record *, i32, i32, i32, [31 x i8], i32, [31 x i8], i8, i8 }
-
-Contains: i8, i32, arrays of i8, and a struct pointer (i32)
- 
-=======Alternatives======= 
- 
-1: Smallest member addressable RAM 
- 
-84 x i8 (84 bytes) 
- 
-Pros: 
-  * Writing does not require a read 
-Cons: 
-  * Reading or writing an i32 takes 4 cycles
- 
-2: Largest member addressable RAM 
- 
-21 x i32 (84 bytes) 
- 
-Pros: 
-  * Reading any member takes one cycle 
-Cons: 
-  * Writing an i8 requires reading the word first so that the neighboring bytes are not overwritten
-  * Extra space may be required 
-  * May read unnecessary data, but this data may be used as a cache 
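The read-before-write cost for i8 can be made concrete with a small software model. This is a minimal sketch, assuming a 21-word RAM and little-endian byte lanes within each word; `RAM_WORDS`, `ram_read_i8`, and `ram_write_i8` are hypothetical names used only for illustration, not part of any existing implementation.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS 21u /* 21 x i32 = 84 bytes, as in alternative 2 */

/* Read one byte: fetch the containing word, then shift and truncate. */
uint8_t ram_read_i8(const uint32_t ram[RAM_WORDS], unsigned byte_addr)
{
    assert(byte_addr < RAM_WORDS * 4u);
    unsigned shift = (byte_addr % 4u) * 8u; /* little-endian lane (assumption) */
    return (uint8_t)(ram[byte_addr / 4u] >> shift);
}

/* Write one byte: read the word, replace one lane, write the word back. */
void ram_write_i8(uint32_t ram[RAM_WORDS], unsigned byte_addr, uint8_t value)
{
    assert(byte_addr < RAM_WORDS * 4u);
    unsigned word  = byte_addr / 4u;
    unsigned shift = (byte_addr % 4u) * 8u;
    uint32_t old   = ram[word];               /* the extra read */
    uint32_t mask  = (uint32_t)0xFFu << shift;
    ram[word] = (old & ~mask) | ((uint32_t)value << shift);
}
```

The word fetched in `ram_write_i8` is the "unnecessary data" mentioned in the cons; holding on to it is what would let it double as a one-word cache.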
- 
-3: Entirely addressable RAM 
- 
-1 x i672 (84 bytes) 
- 
-Pros: 
-  * Struct copying is fast 
-Cons: 
-  * Writing any member always requires a read first so that the rest of the struct is not overwritten
- 
-4: Split into groups by addressability 
- 
-5 x i32, 64 x i8 (84 bytes) 
- 
-Pros: 
-  * Reading and writing any member takes one cycle 
-  * Different blocks can be read/written in parallel
-Cons: 
-  * Code complexity 
-    * Need to keep track of each member's RAM module and offset
-    * Must manage each block separately 
- 
-5: One RAM per member
-
-i32, i32, i32, i32, i32, 31 x i8, 31 x i8, i8, i8 (84 bytes)
- 
-Pros: 
-  * Reading and writing any member takes one cycle 
-  * Simple to code 
-Cons: 
-  * Too many RAMs, allocation restrictions 
- 
-6: Word-addressable RAM
- 
-21 x i32 (84 bytes) 
- 
-Pros: 
-  * Structs are already word-aligned, so this makes memcpy and memset much easier
-  * Easy to implement 
-Cons: 
-  * May need to read before writing
-  * May waste space 
- 
-7: Word-addressable RAM x2 (if word-address % 2 = 0, store in RAM 1, otherwise store in RAM 2)
- 
-11 x i32 + 10 x i32 (84 bytes) 
- 
-Pros: 
-  * Structs are already word-aligned, so this makes memcpy and memset easier
-  * memcpy and memset can be performed at twice the speed, copying to both RAMs in the same cycle
-  * With two RAMs, i64 can be read/written in one cycle
-Cons: 
-  * Pointer dereferencing is more complicated for i64 since it must load/store from both RAMs at the same time
-  * Pointer dereferencing otherwise must pick which RAM to read from or write to
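The even/odd split can be modelled in a few lines. This is a minimal sketch, assuming word addresses 0 to 20 and little-endian word order for i64; the `two_bank_ram` type and the `ram2_*` functions are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define BANK0_WORDS 11u /* even word addresses 0, 2, ..., 20 */
#define BANK1_WORDS 10u /* odd word addresses 1, 3, ..., 19 */
#define TOTAL_WORDS (BANK0_WORDS + BANK1_WORDS)

typedef struct {
    uint32_t bank0[BANK0_WORDS];
    uint32_t bank1[BANK1_WORDS];
} two_bank_ram;

/* The low address bit selects the bank; the remaining bits index into it. */
void ram2_write_i32(two_bank_ram *ram, unsigned word_addr, uint32_t value)
{
    assert(word_addr < TOTAL_WORDS);
    if (word_addr % 2u == 0u)
        ram->bank0[word_addr / 2u] = value;
    else
        ram->bank1[word_addr / 2u] = value;
}

uint32_t ram2_read_i32(const two_bank_ram *ram, unsigned word_addr)
{
    assert(word_addr < TOTAL_WORDS);
    return (word_addr % 2u == 0u) ? ram->bank0[word_addr / 2u]
                                  : ram->bank1[word_addr / 2u];
}

/* Adjacent words always sit in different banks, so in hardware the two
   halves of an i64 could be fetched in the same cycle. */
uint64_t ram2_read_i64(const two_bank_ram *ram, unsigned word_addr)
{
    assert(word_addr + 1u < TOTAL_WORDS);
    uint64_t lo = ram2_read_i32(ram, word_addr);
    uint64_t hi = ram2_read_i32(ram, word_addr + 1u);
    return lo | (hi << 32);
}
```

The same bank selection is what lets memcpy and memset touch two words per cycle: one word goes to each bank.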
- 
-8: Smallest member addressable RAM x largest member size / smallest member size
- 
-21 x i8, 21 x i8, 21 x i8, 21 x i8 (84 bytes) 
- 
-Pros: 
-  * All data types can be read/written in one cycle
-  * Similar storage geometry to default word alignment
-  * memcpy and memset can be performed faster depending on the number of RAMs
-Cons:
-  * memcpy and memset will have to be modified to work
-  * Pointer dereferencing must pick which RAM to read from or write to
-  * Uses more RAMs
-  * More complex to program
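A software model of the four byte-wide RAMs shows why no read is needed before an i8 write. This is a minimal sketch, assuming little-endian lanes and word-aligned i32 accesses; `lane_ram` and the `lane_*` functions are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4u       /* largest member size / smallest member size = 4 / 1 */
#define LANE_DEPTH 21u /* 4 lanes x 21 bytes = 84 bytes */

typedef struct {
    uint8_t lane[LANES][LANE_DEPTH];
} lane_ram;

/* A byte write touches exactly one lane, so nothing has to be read first. */
void lane_write_i8(lane_ram *ram, unsigned byte_addr, uint8_t value)
{
    assert(byte_addr < LANES * LANE_DEPTH);
    ram->lane[byte_addr % LANES][byte_addr / LANES] = value;
}

/* A word write drives the same row of all four lanes at once. */
void lane_write_i32(lane_ram *ram, unsigned byte_addr, uint32_t value)
{
    assert(byte_addr % LANES == 0u && byte_addr < LANES * LANE_DEPTH);
    for (unsigned i = 0; i < LANES; ++i)
        ram->lane[i][byte_addr / LANES] = (uint8_t)(value >> (8u * i));
}

/* A word read gathers one byte from each lane and reassembles them. */
uint32_t lane_read_i32(const lane_ram *ram, unsigned byte_addr)
{
    assert(byte_addr % LANES == 0u && byte_addr < LANES * LANE_DEPTH);
    uint32_t value = 0;
    for (unsigned i = 0; i < LANES; ++i)
        value |= (uint32_t)ram->lane[i][byte_addr / LANES] << (8u * i);
    return value;
}
```

In this model the loops stand in for four RAM ports operating in parallel; each iteration corresponds to one lane accessed in the same cycle.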
- 
-=======Other======= 
-Arrays of Structs: 
-  * Treat like multi-dimensional arrays 
- 
-Arrays of Structs, Structs of Arrays, Structs of Structs...: 
-  * Must be done recursively or with a stack 
- 
-Struct Alignment: 
-  * Structs are declared as align 8 by llvm-gcc, but seem to be internally align 4, so a char followed by an int takes up 8 bytes (3 unused). char, int, char takes up 12 bytes, whereas char, char, int takes up only 8.
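The sizes quoted above follow from a simple padding rule. This is a minimal sketch, assuming each member is placed at the next offset that is a multiple of its own alignment and the total is rounded up to the struct alignment (4 here, matching the observed internal alignment); `member`, `round_up`, and `struct_layout` are hypothetical names, and this is not llvm-gcc's actual code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t size;  /* member size in bytes */
    size_t align; /* member alignment in bytes */
} member;

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Assign each member the next suitably aligned offset and return the total
   size padded to struct_align. offsets may be NULL if they are not needed. */
size_t struct_layout(const member *m, size_t count, size_t struct_align,
                     size_t *offsets)
{
    size_t off = 0;
    for (size_t i = 0; i < count; ++i) {
        off = round_up(off, m[i].align); /* padding before this member */
        if (offsets)
            offsets[i] = off;
        off += m[i].size;
    }
    return round_up(off, struct_align);  /* tail padding */
}
```

With char described as `{1, 1}` and int as `{4, 4}`, `struct_layout` returns 8 for char, int; 12 for char, int, char; and 8 for char, char, int. The per-member offsets it fills in are the same values a getelementptr on the struct would need.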
- 
-=======Instructions To Implement======= 
- 
-alloca: alloca %struct.conglomerate, align 8
-
-getelementptr: getelementptr inbounds %struct.conglomerate* %r, i32 0, i32 3
- 
-memcpy: call void @llvm.memcpy.i32(i8* %r22, i8* %r1, i32 124, i32 8) 
- 
-memset: call void @llvm.memset.i32(i8* %r1, i8 0, i32 124, i32 8) 
- 
-load: load i32* %5, align 4 
- 
-store: store i8 97, i8* %r1, align 8 
- 
-bitcast: bitcast %struct.record* %1 to i8* 
- 
-ptrtoint: %78 = ptrtoint %struct.record* %77 to i32 
- 
-=======alloca======= 
-Find the space required for the struct and determine its alignment size.
- 
-=======getelementptr======= 
-Make sure the returned pointer is aligned to the same size as the struct.
- 
-=======memcpy/memset=======
-Perform memcpy/memset in units of the struct alignment.
- 
-=======load======= 
-May take more than one cycle, depending on size 
- 
-=======store======= 
-May need a read before the write, and may need more than one cycle to store, depending on size
- 
-=======bitcast======= 
-May not need to modify 
- 
-=======ptrtoint======= 
-May not need to modify 
  
struct_implementation.txt · Last modified: 2010/12/15 15:53 (external edit)