Failure mechanisms in semiconductors


Inherently reliable?

In a previous edition of Practical Reliability Engineering, Patrick O’Connor made the statement that ‘For hermetically sealed semiconductor devices, there are no inherent wear-out failure modes. That is, there are no failure mechanisms which depend upon operating or non-operating time, within a correctly manufactured device.’

This very positive statement about semiconductor devices gives false assurance. We know from experience that sometimes devices do fail, that components on printed circuit assemblies occasionally have to be replaced at functional test, and that equipment in the field does develop faults in use due to semiconductor failure.

The reason for this apparent contradiction lies partly in the qualifications – hermetically sealed devices, correctly manufactured – and partly in the implicit assumption that the device is being used correctly. O’Connor went on to remark that a device can fail if it is overloaded beyond its design rating (temperature, voltage or current) or if there is a defect which causes immediate or progressive weakening.

So, in practice, failure can occur because of faults both in the way the parts are made and in the way they are used. IC reliability as supplied depends on quality control of the manufacturing processes and on the effectiveness of the screening techniques used to remove defective devices; IC reliability in use depends on the fitness of the part for its intended application.

There is a further requirement, positioned between supply and use, for correct handling and processing of the semiconductor component by the board assembler. As will be shown later, mechanical, thermal and electrical over-stresses applied (often unintentionally) during assembly and test can cause failure either in-process or (worse) during life.

Activity

Reflect on what you know about the construction and materials of a plastic-packaged semiconductor – if necessary, reread Semiconductor packages. What are the possible ways in which such a device might fail?

Compare your answer with the description that follows, especially Figure 1.




Selected failure modes and causes

Table 1 summarises the more common internal failure modes which apply to semiconductor integrated circuits. Some of these relate to over-stress in the application, whilst others are caused by manufacturing faults or environmental conditions. Figure 1 shows typical failure sites related to the plastic packaging of an integrated circuit.

Table 1: Microcircuit device failure modes

adapted from O’Connor 2002

| Failure mode | Caused by | Prevention |
| --- | --- | --- |
| On-chip open circuit | Electromigration (bulk movement of aluminium conductor track material due to electron flow) | Limit current density and operating temperature; quality control of metallisation process |
| On-chip open circuit | Current over-stress | Circuit protection; ESD control |
| On-chip open circuit | Corrosion of tracks due to moisture ingress | Quality control of passivation and packaging |
| Wire bond open circuit | Broken/lifted wire bond; corrosion; intermetallic growth (‘purple plague’) | Quality control of bonding and packaging |
| On-chip short circuit | Voids in dielectric (passivation) layers | Quality control of passivation |
| On-chip short circuit | Voltage over-stress of dielectric | Circuit protection; ESD control |
| On-chip short circuit | Inclusions in package | Quality control of packaging |
| Incorrect transistor action | Bulk silicon or oxide defects; mask misalignment; impurities; inclusions | Quality control of processes |
| Incorrect transistor action | ‘Hot electrons’; γ radiation (space, military) | Selection of ‘hard’ technology such as silicon-on-sapphire |
| ‘Latch up’ (destruction of CMOS transistors due to internal positive feedback) | Transient current over-stress | Internal and external protection circuits; ESD control |
| Data corruption (‘soft’ errors) | α-particle emissions from package material; electromagnetic interference (EMI) | Polyimide coating; EMI protection; software techniques |
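
Table 1 recommends limiting both current density and operating temperature to guard against electromigration. The sketch below illustrates why both matter, using Black's equation, a widely used empirical model that is not part of the source table and is brought in here as an assumption. The exponent `n`, the activation energy `ea_ev` and the operating conditions are illustrative values only, not data for any particular process.

```python
import math

# Boltzmann constant in eV/K
BOLTZMANN_EV_PER_K = 8.617333262e-5


def relative_em_lifetime(j, temp_c, j_ref, temp_ref_c, n=2.0, ea_ev=0.6):
    """Electromigration lifetime relative to a reference condition.

    Assumes Black's equation, MTTF = A * J**(-n) * exp(Ea / (k * T)).
    The constant A cancels in the ratio, so only the ratio is returned.
    n and ea_ev are illustrative assumptions, not measured values.
    """
    t = temp_c + 273.15          # convert to kelvin
    t_ref = temp_ref_c + 273.15
    current_term = (j_ref / j) ** n
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t - 1.0 / t_ref))
    return current_term * thermal_term


if __name__ == "__main__":
    # Hypothetical comparison: half the current density at the same temperature,
    # then the same current density with the track running 20 degC hotter.
    print(relative_em_lifetime(j=0.5, temp_c=85, j_ref=1.0, temp_ref_c=85))
    print(relative_em_lifetime(j=1.0, temp_c=105, j_ref=1.0, temp_ref_c=85))
```

Under these assumptions, reducing current density lengthens the predicted lifetime and raising temperature shortens it, which is consistent with the prevention advice in the table. The absolute numbers should not be relied on without process-specific parameters.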

Figure 1: Potential failure sites on a polymer-encapsulated integrated circuit


after Lau 1997

Both temperature cycling and high temperature operation can also affect reliability by accelerating the onset of failure due to defects. In general, temperature cycling accelerates stress-related failure, and elevated temperature reduces reliability where diffusion and oxidation are the predominant failure mechanisms. However, semiconductor reliability will not be markedly affected, provided that:



Improving reliability

Hermetic packages are expensive, and often relatively large and mechanically fragile. This led to the development in the 1970s of much cheaper polymer-encapsulated equivalents. In these, the silicon die is mounted on a free-standing lead-frame and the assembly is then encapsulated in resin, usually by transfer-moulding.

Early devices exhibited high failure rates, of around 100 failures per million device hours, mainly due to:

These problems have been overcome, to create protected devices which are reliable enough for most applications. A number of major improvements have contributed to this:

As a result, the early-life failure rate of plastic-packaged microcircuits has been reduced by orders of magnitude, although the failure mechanisms have not been totally eliminated. Reported failure rates depend both on device selection and on the end use. They range from typically 0.1–0.7 failures per million device hours for parts in automotive environments to less than 0.001 failures per million device hours in computers and test applications.
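
To put these figures in perspective, the following minimal Python sketch converts a rate quoted in failures per million device hours into FIT (failures per 10^9 device hours) and estimates the expected number of failures in a population of devices. It assumes a constant failure rate, which is a simplification. The population size and operating hours are hypothetical values chosen for illustration; only the failure rates themselves come from the text.

```python
def to_fit(failures_per_million_hours):
    """Convert failures per 10^6 device hours to FIT (failures per 10^9 device hours)."""
    return failures_per_million_hours * 1000.0


def expected_failures(failures_per_million_hours, devices, hours_each):
    """Expected failure count, assuming a constant failure rate (a simplification)."""
    device_hours = devices * hours_each
    return failures_per_million_hours * device_hours / 1_000_000


if __name__ == "__main__":
    # Failure rates quoted in the text, in failures per million device hours.
    rates = {
        "early plastic-packaged devices": 100.0,
        "automotive (upper figure)": 0.7,
        "computers and test (upper bound)": 0.001,
    }
    devices = 10_000      # hypothetical population size
    hours_each = 5_000    # hypothetical operating hours per device

    for label, rate in rates.items():
        fit = to_fit(rate)
        failures = expected_failures(rate, devices, hours_each)
        print(f"{label}: {fit:g} FIT, about {failures:g} expected failures")
```

With these hypothetical numbers the population accumulates 50 million device hours, so the difference between the early and the modern failure rates is the difference between thousands of expected failures and a fraction of one.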

It is important to appreciate that, despite their great complexity and the many ways in which microelectronic devices can fail, modern manufacturing processes provide very high quality levels. As a result, only a small proportion of modern electronic system failures are due to the inherent failure of the microelectronic devices themselves. Most problems are caused by insufficient care in system design and use, and by failure to ensure adequate protection against externally induced failures during board assembly. This topic is explored in some depth in the sections which follow, especially in those which discuss ESD-induced failures.

Self Assessment Questions

One of your colleagues believes that semiconductors are reliable; another that they are the reason for most of your company’s customer returns. Their argument is getting heated, so you are asked to intervene! Use this as an opportunity to share with them your insights as to how reliable a semiconductor is, and how the unreliability of a device is affected by its manufacture, assembly and application.
