GTD190025: [Translation] The Lost Art of C Structure Packing
http://www.catb.org/esr/structure-packing/
Eric S. Raymond
Table of Contents
- 1. Who should read this
- 2. Why I wrote it
- 3. Alignment requirements
- 4. Padding
- 5. Structure alignment and padding
- 6. Bitfields
- 7. Structure reordering
- 8. Awkward scalar cases
- 9. Readability and cache locality
- 10. Other packing techniques
- 11. Overriding alignment rules
- 12. Tools
- 13. Proof and exceptional cases
- 14. Supporting this work
- 15. Related Reading
- 16. Version history
cvs-fast-export and the problem was that it was dying with out-of-memory errors on large repositories.
There are ways to reduce memory usage significantly in situations like this, by rearranging the order of structure members in careful ways. This can lead to dramatic gains - in my case I was able to cut the working-set size by around 40%, enabling the program to handle much larger repositories without dying.
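As a minimal sketch of the kind of rearrangement meant here, consider the two hypothetical structures below. The names `loose`, `tight`, and `bytes_saved_per_node` are invented for illustration and are not taken from cvs-fast-export. The sizes quoted in the comments assume a typical 64-bit platform with self-aligned types and 8-byte pointers and longs; other platforms may give different numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with members in a careless order.
 * Assuming 8-byte self-aligned pointers and longs, each char is
 * followed by 7 bytes of padding, for a total of 32 bytes. */
struct loose {
    char  tag;
    void *next;
    char  flag;
    long  count;
};

/* Same members, widest first. Under the same assumptions the two
 * chars share one slot and only 6 bytes of trailing padding remain,
 * for a total of 24 bytes. */
struct tight {
    void *next;
    long  count;
    char  tag;
    char  flag;
};

/* Bytes saved per allocated node by the reordering. */
size_t bytes_saved_per_node(void)
{
    return sizeof(struct loose) - sizeof(struct tight);
}
```

The saving per node looks small, but it is multiplied by every instance the program keeps in memory, which is where reductions in working-set size come from.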
But as I worked, and thought about what I was doing, it began to dawn on me that the technique I was using has been more than half forgotten in these latter days. A little web research confirmed that C programmers don’t seem to talk about it much any more, at least not where a search engine can see them. A couple of Wikipedia entries touch the topic, but I found nobody who covered it comprehensively.
There are actually reasons for this that aren’t stupid. CS courses (rightly) steer people away from micro-optimization towards finding better algorithms. The plunging price of machine resources has made squeezing memory usage less necessary. And the way hackers used to learn how to do it back in the day was by bumping their noses on strange hardware architectures - a less common experience now.
But the technique still has value in important situations, and will as long as memory is finite. This document is intended to save C programmers from having to rediscover the technique, so they can concentrate effort on more important things.
barrel shifters) have had more restrictive ones. If you do embedded systems, you might trip over one of these lurking in the underbrush. Be aware this is possible.
From when it was written at the beginning of 2014 until late 2016 this section ended with the last paragraph. During that period I’ve learned something rather reassuring from working with the source code for the reference implementation of NTP. It does packet analysis by reading packets off the wire directly into memory that the rest of the code sees as a struct, relying on the assumption of minimal self-aligned padding.
The interesting news is that NTP has apparently been getting away with this for decades across a very wide span of hardware, operating systems, and compilers, including not just Unixes but Windows variants as well. This suggests that platforms with padding rules other than self-alignment are either nonexistent or confined to such specialized niches that they’re never either NTP servers or clients.
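The overlay technique described above can be sketched roughly as follows. This is an illustration only: `struct wire_hdr` and `overlay_hdr` are hypothetical names, and the field layout is invented, not the real NTP packet format. The compile-time checks state the self-alignment assumption explicitly, so a platform with different padding rules would fail to build rather than misread packets. C11 is assumed for `_Static_assert`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header; NOT the actual NTP layout. */
struct wire_hdr {
    uint8_t  flags;
    uint8_t  kind;
    uint16_t length;    /* multi-byte fields arrive in network byte order */
    uint32_t sequence;
};

/* Make the "minimal self-aligned padding" assumption explicit. */
_Static_assert(offsetof(struct wire_hdr, kind) == 1, "unexpected padding");
_Static_assert(offsetof(struct wire_hdr, length) == 2, "unexpected padding");
_Static_assert(offsetof(struct wire_hdr, sequence) == 4, "unexpected padding");
_Static_assert(sizeof(struct wire_hdr) == 8, "unexpected trailing padding");

/* Copy raw packet bytes into the struct. memcpy sidesteps any
 * alignment requirement on the source buffer. */
struct wire_hdr overlay_hdr(const unsigned char buf[8])
{
    struct wire_hdr h;
    memcpy(&h, buf, sizeof h);
    return h;
}
```

Note that this sketch only addresses layout. The `length` and `sequence` fields would still need byte-order conversion before use on a little-endian host.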
more information.
I have not used it myself, but several respondents speak well of a program called pahole. This tool cooperates with a compiler to produce reports on your structures that describe padding, alignment, and cache line boundaries.
I’ve received a report that a proprietary code auditing tool called "PVS Studio" can detect structure-packing opportunities.
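When no such tool is at hand, a rough hand-rolled check is possible with `offsetof`. The sketch below is not how pahole or any other tool works internally; `HOLE_BETWEEN`, `TAIL_PADDING`, and `struct sample` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure to inspect. */
struct sample {
    char c;
    int  i;
    char d;
};

/* Padding bytes between the end of member `prev` and the start of
 * member `next`. The members must be adjacent in declaration order.
 * The null-pointer expression sits inside sizeof, so it is never
 * evaluated. */
#define HOLE_BETWEEN(type, prev, next) \
    (offsetof(type, next) - \
     (offsetof(type, prev) + sizeof(((type *)0)->prev)))

/* Padding bytes after the last member of the structure. */
#define TAIL_PADDING(type, last) \
    (sizeof(type) - \
     (offsetof(type, last) + sizeof(((type *)0)->last)))
```

For `struct sample`, `HOLE_BETWEEN(struct sample, c, i)` gives the slop after `c`, and `TAIL_PADDING(struct sample, d)` gives the trailing padding. On a platform with 4-byte self-aligned ints one would expect 3 bytes for each, but that is an assumption about the target, not a guarantee.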
packtest.c.
If you look through enough strange combinations of compilers, options, and unusual hardware, you will find exceptions to some of the rules I have described. They get more common as you go back in time to older processor designs.
The next level beyond knowing these rules is knowing how and when to expect that they will be broken. In the years when I learned them (the early 1980s) we spoke of people who didn’t get this as victims of "all-the-world’s-a-VAX syndrome". Remember that not all the world is a PC.
Patreon feed. The time needed to write and maintain documents like this one is not free, and while I enjoy giving them to the world my bills won’t pay themselves. Even a few dollars a month - from enough of you - helps a lot.
A Guide to Undefined Behavior in C and C++
Time, Clock, and Calendar Programming In C
Things Every Hacker Once Knew
16. Version history
- 1.18 @ 2017-06-01
- More general zero-padding orders. C11 and C++14 relax a constraint on bitfield packing.
- 1.17 @ 2016-11-14
- Typo fixes.
- 1.16 @ 2016-10-21
- Answer an objection about allocation order being unrelated to source order.
- 1.15 @ 2016-10-20
- Note the field evidence from NTP.
- 1.14 @ 2015-12-19
- Typo correction: -Wpadding → -Wpadded.
- 1.13 @ 2015-11-23
- Be explicit about padding bits being undefined. More about bitfields.
- 1.12 @ 2015-11-11
- Major revision of section on bitfields reflecting C99 rules.
- 1.11 @ 2015-07-23
- Mention the clang -fdump-record-layouts option.
- 1.10 @ 2015-02-20
- Mention __attribute__((packed)), -fpack-struct, and PVS Studio.
- 1.9 @ 2014-10-01
- Added link to "Time, Clock, and Calendar Programming In C".
- 1.8 @ 2014-05-20
- Improved explanation for the bitfield examples.
- 1.7 @ 2014-05-17
- Correct a minor error in the description of the layout of struct foo8.
- 1.6 @ 2014-05-14
- Emphasize that bitfields cannot cross word boundaries. Idea from Dale Gulledge.
- 1.5 @ 2014-01-13
- Explain why structure member reordering is not done automatically.
- 1.4 @ 2014-01-04
- A note about double under x86 Linux.
- 1.3 @ 2014-01-03
- New sections on awkward scalar cases, readability and cache locality, and tools.
- 1.2 @ 2014-01-02
- Correct an erroneous address calculation.
- 1.1 @ 2014-01-01
- Explain why aligned accesses are faster. Mention offsetof. Various minor fixes, including the packtest.c download link.
- 1.0 @ 2014-01-01
- Initial release.