Recently I read a long thread at Slashdot discussing the positive and negative impacts of using any kind of BadMEM/BadRAM technology. Skipping several flaming messages, I can summarize the 260+ KB of text as a good discussion of the strategic pros and cons.
Here, I would like to comment on some of the main points which I think some writers at Slashdot got wrong. Please note that the text below only reflects my personal opinion on this issue. My intention is NOT to flame or accuse any author at Slashdot, but to help everybody out there make up his/her mind on the weighting of the different advantages and disadvantages of BadMEM.
First, I must try to clear up a misunderstanding:
Any current usage of BadMEM assumes that the defects of your memory modules are static.
If holes appear randomly (especially if the positions of these holes
change over time), you will encounter severe problems, even if you are using
the BadMEM patch extensively. In its current form, it is an absolute precondition
that the defects of your RAM are non-dynamic. Though we can restrict
the current discussion to statically defective RAM, this issue cannot be skipped.
Many authors at Slashdot muddled this basic point. The most common argument against
the usage of BadMEM was that you can never really trust a memory module which
already has some bad bits to stay in that state. I think that is only
half of the truth. Let's take two examples: Imagine a module which got a broken
pin during production which, by chance, was not detected during quality checks;
take another module with a production flaw in its crystalline structure
that leads to temporary data loss at high temperatures. Which of the modules
do you trust more? Can you trust even one of them? Or even further: Can you
trust any purchased module at all?
First, you must see that there is no universal answer. It is a philosophical question
which must be answered by everyone - including you. It highly depends on your
experience and your environment. If I were the head administrator of a big,
centralized company, I would never use questionable RAM modules in my production
environment (it is likely that I would never even have the time to play around with
them in an experimental environment). But if I were a poor student able
to save several dollars in exchange for some hours of testing time, which
the PC can normally do at night, things look worthwhile again.
I blame these differences in behaviour, habits and experience for having made the
discussion at Slashdot so emotional. To get out of this trap in the strategic development
of BadMEM, I would like to suggest the following differentiation before continuing:
(Please note that I do not claim that the list above is complete!) As you can see, this discussion leads towards common economic problems which are often solved by Rational Decision Theory with its measured weighting and decision functions (if you are a German-speaking reader, you might want to have a look at "Unternehmenspolitik" by Kieser/Oechsler, Schäffer/Poeschel, 1999, p. 58f and 70f). In my opinion it does not make sense to explain all the details of this theory here. The only thing I would still like to mention is that making decisions is something that makes us all human beings. Therefore I want to make you think about your own decision, whether you are pro BadMEM or contra. While you are speaking against it and damning it to hell, another might see a new era dawning. Is only one opinion correct?
While many people (at Slashdot) are discussing on the level of personal usage, a minority tried to shed light on some economic problems. I only read one message talking about the logistical problems: As we can deduce from Table 1, it is likely that every single module has its own characteristics. This makes modules complicated to compare. Therefore, adequate standards of comparison must be developed. In the same way as in Table 1, here is a suggestion for the dimensions of classification:
Dimension | Comment |
---|---|
Quantity of Failure | The number of bad bits in the module should always be taken into consideration; for static damage, there are already two types of classification: "BadRAM Class" and "BadMEM Type". For further information, please have a look at the BadMEM-4096 Specification and at Rick's Page |
Quality of Failure | Purely static damage has different basic characteristics than temperature-related failures and therefore leads to different usability. |
The dimensions above mainly aim at a single question: 'How can we estimate
the probability of a memory failure?'. Please note in this context that a memory
module might have two (or even more) different types of failures. All of them have
to be taken into consideration.
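As a rough illustration of taking several failure types into consideration at once, the sketch below combines per-type failure probabilities into one overall figure. It assumes, purely for illustration, that the failure types are independent of each other, which real modules may well violate. The function name `combined_failure_probability` and the example numbers are hypothetical and are not measurements of any real module.

```python
from math import prod


def combined_failure_probability(probabilities):
    """Probability that at least one of the failure types occurs.

    Assumption (not taken from any BadMEM/BadRAM specification): the
    failure types are statistically independent, so
    P = 1 - product of (1 - p_i).
    """
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p!r}")
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical probabilities for two failure types of one module,
# e.g. a static defect and a temperature-related failure.
example = combined_failure_probability([0.01, 0.05])
print(f"{example:.4f}")  # prints 0.0595
```

Even under this simplification, the point of the classification remains visible: a module with two small, different failure risks is less trustworthy than either number alone suggests.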
So, standardization might have a certain impact on the mentioned logistical
problems. Again, the necessary extent of standardization must be determined by the
markets.
Some authors at Slashdot were afraid that money-oriented "businessmen" could sell BadMEM PCs as new, 100%-ok PCs, fooling the normal users out there. This economic aspect of the BadMEM issue simply leads to a philosophical problem: Should you develop something that people might misuse? Again, we must ask about the risks and what can be gained. In my opinion it is worthwhile working on BadMEM, as we can take several precautions against misuse:
Anyway, this is no complete security, but 'total security' is always an illusion.
However, the sale of BadMEM PCs will likely have a positive impact on the
development of Linux, too: in the early days of this feature, Linux will
be the only operating system which will work on this sort of PC. In any case
I want to make clear that I am no friend of these dubious "businessmen",
because they are not interested in creating wealth, but in making money.
In the end, one can still say that the unethical behaviour of a minority should
not stop the majority from making progress.
Another important economic aspect is the question of warranties. The issue becomes very complex if you think not just of static damage in the RAM, but also of dynamic damage. As there is no market for bad memory modules yet, I suggest skipping this problem until there is one.
The discussion at Slashdot has shown me a common need: The on-the-fly feature, namely checking suspicious RAM during the idle time of the Linux kernel and dynamically allocating "broken" memory, is a central wish of the upcoming BadMEM community. Therefore, I want to make this dream become reality.
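To make the idea of an on-the-fly check a little more concrete, here is a minimal user-space simulation. It is not kernel code and not the way the BadMEM patch works. The class `SimulatedRAM`, the function `find_bad_pages`, the page size and the test patterns are all assumptions made for illustration. The test also overwrites whatever it checks, so a real implementation could only touch unused pages or would have to save and restore their contents.

```python
PAGE_SIZE = 4096                      # bytes per page (assumed value)
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)   # simple test patterns (assumed)


class SimulatedRAM:
    """Stand-in for physical memory with optional stuck-at-zero bits."""

    def __init__(self, size, stuck_low=None):
        self._cells = bytearray(size)
        # Maps an address to a bit mask of bits that always read as 0.
        self._stuck_low = dict(stuck_low or {})

    def write(self, addr, value):
        # Defective bits silently drop to 0 when written.
        self._cells[addr] = value & ~self._stuck_low.get(addr, 0) & 0xFF

    def read(self, addr):
        return self._cells[addr]


def find_bad_pages(ram, first_page, page_count):
    """Write each pattern to every byte of a page and read it back.

    Returns the set of page numbers where any read-back differed.
    """
    bad = set()
    for page in range(first_page, first_page + page_count):
        start = page * PAGE_SIZE
        addresses = range(start, start + PAGE_SIZE)
        for pattern in PATTERNS:
            for addr in addresses:
                ram.write(addr, pattern)
            if any(ram.read(addr) != pattern for addr in addresses):
                bad.add(page)
                break
    return bad


# Four pages, with one defective bit in page 2 (hypothetical defect).
ram = SimulatedRAM(4 * PAGE_SIZE, stuck_low={2 * PAGE_SIZE + 17: 0x01})

# Check one page per "idle slice" and remember the pages to lock out.
locked_out = set()
for page in range(4):
    locked_out |= find_bad_pages(ram, page, 1)
print(sorted(locked_out))  # prints [2]
```

Checking a single page per slice keeps each idle-time step short. Note that a single pass like this only catches defects that are present at test time, which is exactly why the distinction between static and dynamic defects matters.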