Defining and Impacting Throughput of Automation Equipment - It's Not As Straightforward as it Sounds

You became an engineer, not a lawyer, so you probably don’t really like to think about contracts and contractual obligations. But even the most elegant designs are not very useful if they don’t satisfy a business objective, and business objectives are measured in revenue (output), costs (capital and operating), profits, and risks. While you may not be involved in negotiations on penalty clauses, bank guarantees, warranty coverage, etc., the one thing that equipment design directly impacts is throughput, which will be one of the main buy-off criteria, so on that front, you really do need to pay very close attention to the contract. Unfortunately, a lot of the terminology that ends up in specifications and quotations is imprecise, and sometimes even conflicting or misused, so it’s important to take the time to understand exactly what is expected and what your design can do to meet that expectation. In this post, I will try to help you navigate some of the terminology and what it means to you as a designer or project manager, which will hopefully give you the background to ask the right questions and clarify expectations.

(More than) A Few Definitions……

Takt Time

Takt Time is often used interchangeably with Net Cycle Time, but that is a bit of a misnomer. Takt time is the pace at which a line must produce parts to satisfy customer demand, so it varies with that demand, whereas cycle time is the fastest rate at which a line can produce parts irrespective of demand. So, yes, they are related in that one is a limit to the other….if the cycle time of the equipment is 10s, you cannot run to a takt time that is faster than 10s….but they are technically not the same thing in production. Since specifications for equipment are generally written to satisfy a given volume, the mixing of the terms isn’t that big of a deal, but some customers will ask for solutions that allow for varying output or varying model mixes, so understanding the real definition can be beneficial in those conversations.
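To make the distinction concrete, here is a minimal sketch. It assumes the conventional takt calculation (available production time divided by customer demand), and the function names and shift numbers are made up for illustration rather than taken from any specification.

```python
def takt_time_s(available_time_s: float, demand_parts: int) -> float:
    """Seconds per part the line must average to satisfy demand."""
    if demand_parts <= 0:
        raise ValueError("demand must be a positive number of parts")
    return available_time_s / demand_parts


def can_meet_demand(takt_s: float, equipment_cycle_s: float) -> bool:
    """The line keeps up only if it can cycle at least as fast as takt."""
    return equipment_cycle_s <= takt_s


# Illustrative numbers only: 415 planned minutes in a shift, 10 s equipment cycle.
planned_s = 415 * 60
for demand in (1500, 2490, 3000):
    takt = takt_time_s(planned_s, demand)
    print(f"demand={demand}  takt={takt:.1f} s  achievable={can_meet_demand(takt, 10.0)}")
```

Takt moves every time demand moves, while the 10 second equipment cycle stays put. Once demand pushes takt below 10 seconds, the line simply cannot keep up.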

Cycle Time

Despite just using this term a couple of times in the previous paragraph, I would argue that except for very casual, non-technical conversations, “cycle time” should not be used as a term to define throughput of a line and should be avoided in any meaningful discussions. It is imprecise and open to interpretation and will likely create confusion.

Equipment or Automation Cycle Time

This term is a bit of a holdover from the olden days, but worth mentioning. There was a time when the customer base maintained large staffs of industrial engineers whose entire job was to define what a “reasonable” operator/manual cycle time was. They took responsibility for that part of the cycle, so all that the machine builders were held accountable for was the automation portion of the machine. This was contractually the cleanest measure of throughput for us since we have no control over the training, skill, and motivation of the operators. Over time, though, responsibility for the entire content has been pushed to the equipment supplier, so you likely won’t see this as a standalone final acceptance criterion. It is still good to keep the term alive, because in a dispute, proving that the equipment is capable on its own is an important part of the conversation.

Efficiency

Efficiency is usually defined as the actual output of a line versus the theoretical maximum output of a line. When used as a standalone definition, it’s normally calculated by dividing the number of parts produced (good or bad) over time (the time window should be fairly long relative to the target cycle time…..so multiple hours or a shift is typical) by total parts that should have been made in that period, excluding any scheduled downtime such as breaks, maintenance, etc, and then multiplying by 100 and representing the number as a percentage. Sometimes, the definition is modified slightly to only include good parts made, so there is some ambiguity in the term, but for the most part, it’s fairly widely accepted. Typically achievable/acceptable numbers are anywhere from 80%-95% depending on industry, complexity, etc.

Design Cycle Time

This is the target short term cycle time of the line excluding any downtime, inefficiencies, etc. It is not a widely used term, but I think still has some value because it clearly distinguishes the goal versus measured results. It is the same as Target Gross Cycle Time, so feel free to use either.

Gross Cycle Time (Target or Measured)

This is the short term cycle time of the line excluding any downtime, inefficiencies, etc. It’s an easy concept to grasp, but a little harder to measure in reality because those inefficiencies and downtimes do exist. It’s best if the measurement method for this is established upfront in the quotation, but at the very least, it should be nailed down prior to runoff. The best thing I’ve seen is that you pick a relatively small number of cycles…..maybe only ten or twenty….so that you can ensure that it is a trouble free test, take the overall time to produce those parts and then divide by that number of parts to get the average gross cycle time. Sometimes, demonstrating this number to the customer isn’t even a runoff requirement, but it’s absolutely essential data to have in case you have to dig deeper into a throughput issue, so never complete debug without documenting it.
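The measurement described above is just elapsed time over a short, trouble-free run divided by the parts produced. A minimal sketch, with a hypothetical function name and made-up stopwatch readings:

```python
def average_gross_cycle_time_s(start_s: float, end_s: float, parts: int) -> float:
    """Elapsed time of a trouble-free run divided by the parts it produced."""
    if parts <= 0:
        raise ValueError("need at least one part")
    if end_s < start_s:
        raise ValueError("end time precedes start time")
    return (end_s - start_s) / parts


# Hypothetical readings: 20 consecutive parts in 204 seconds.
measured = average_gross_cycle_time_s(0.0, 204.0, 20)
print(f"measured gross cycle time: {measured:.1f} s/part")
```

If a stoppage occurs during the run, the sample no longer represents gross cycle time and should be repeated rather than averaged in.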

Net Cycle Time (Target or Measured)

Net cycle time is simply gross cycle time divided by efficiency (equivalently, the gross output rate multiplied by efficiency), so it reflects the output of the line over time during scheduled production. It was a fairly common criterion for passing a runoff that qualifies the equipment, but it has since been largely superseded by meeting an OEE target (explained below).
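Because efficiency scales output, it helps to look at the relationship both as a rate and as a time per part. The sketch below assumes efficiency is expressed as a fraction between 0 and 1; the function names and the 10 second / 90% figures are illustrative only.

```python
def net_output_per_hour(gross_cycle_s: float, efficiency: float) -> float:
    """Parts per scheduled hour: the theoretical rate scaled down by efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    return (3600.0 / gross_cycle_s) * efficiency


def net_cycle_time_s(gross_cycle_s: float, efficiency: float) -> float:
    """Effective seconds per part once losses are spread over every part."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    return gross_cycle_s / efficiency


# Illustrative: a 10 s gross cycle running at 90% efficiency.
print(f"{net_output_per_hour(10.0, 0.90):.0f} parts/hour")
print(f"{net_cycle_time_s(10.0, 0.90):.2f} s/part")
```

Output drops in proportion to efficiency, so the effective time per part grows by the same factor.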

First Time Yield/First Pass Yield

This is a quality measure that is derived by dividing the total number of good parts made by the total number of parts, multiplying by 100, and representing the result as a percentage. As you would expect, acceptable values for this are typically >99%. One cautionary note about this being used as acceptance criteria for your equipment: you need to really analyze the root causes of any “reject” parts that take away from the numerator of the calculation. Especially in assembly equipment, many times, the failure mode is not driven by the equipment, but by the underlying part geometry. A failed press force creates a reject part. But is it because the hole was .002” oversize? Or is it because the press is going too fast? Did the torque fail because of tight threads or an incorrectly set up torque profile in the equipment? It may be tedious to dig all this out, but usually only one or two rejected parts are enough to disqualify a runoff, so anything beyond the control of the equipment should be mathematically excluded when doing the calculation (removed from both the reject count and the total part count in the denominator). The effects of that exclusion need to be looked at in the throughput calculation, too (subtracted from the theoretical parts produced).
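A small sketch of the yield calculation with the exclusion applied. The function name, the `excluded_rejects` parameter, and the part counts are hypothetical; which rejects qualify for exclusion is something to agree on with the customer, not something the code decides.

```python
def first_time_yield_pct(total_parts: int, rejects: int, excluded_rejects: int = 0) -> float:
    """First time yield in percent.

    excluded_rejects are rejects traced to causes outside the equipment's
    control (for example, out-of-tolerance incoming parts). They are removed
    from both the reject count and the total part count.
    """
    if not 0 <= excluded_rejects <= rejects <= total_parts:
        raise ValueError("need 0 <= excluded_rejects <= rejects <= total_parts")
    counted_total = total_parts - excluded_rejects
    if counted_total == 0:
        raise ValueError("no parts left to evaluate")
    counted_rejects = rejects - excluded_rejects
    good_parts = counted_total - counted_rejects
    return 100.0 * good_parts / counted_total


# Hypothetical run: 500 parts, 3 rejects, 2 of them traced to oversize holes.
print(f"raw FTY:      {first_time_yield_pct(500, 3):.2f}%")
print(f"adjusted FTY: {first_time_yield_pct(500, 3, excluded_rejects=2):.2f}%")
```

In this made-up run, excluding two rejects is the difference between missing and clearing a 99.5% threshold, which is why the root cause work is worth the tedium.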

Overall Equipment Effectiveness (OEE)

This has become the holy grail for automotive manufacturers and it is catching on in other industries, too. It combines elements discussed above into one measurement of the line’s ability to produce parts. It has three terms to the calculation:

Availability - the amount of time that the equipment is available for production during planned production time. So, if a shift is 8 hours and there is a half hour lunch, two ten minute breaks and fifteen minutes of cleanup and preventative maintenance, the planned production time is 415 minutes/shift. If your equipment was down (not able to run) for 20 minutes due to a switch that failed and needed to be replaced, that means your availability for that shift was (415-20)/415 x 100, or 95.2%.
Performance - the “efficiency” described above. This takes into account small stoppages, slow cycles, non-cyclical restocking that prevents production, etc. So, if the design cycle time of the line is 10 seconds and the average cycle time while the machine was available was 11 seconds, that term would be 10/11x100, or 91%. Since that can be difficult to measure, sometimes “parts in the box” is used. If the line was available for the 395 minutes in the shift above (415-20), and the design cycle time was 10 seconds, theoretically, you should have produced 2,370 parts. Let’s say you created 3 rejects, so those would be excluded (they get counted in the next term), so the denominator is 2,367. If you actually produced 2,151 good parts, your performance is 2,151/2,367 x 100 or 90.9%. Side note, this term is almost always capped at 100%, so you can’t intentionally design the line faster than the theoretical need just to pass runoff, although there are places where you may want to do just that. We’ll discuss that more below.
Quality - this incorporates the First time Yield calculation as the third term. So, to continue our example, we made 2,151 good parts and three rejects in our make believe shift. The quality term would then calculate to 2,151/(2,151 + 3) x 100, or 99.9%

With the above example, the OEE would calculate to 0.952 x 0.909 x 0.999 x 100 or 86.5%. Up until about 5-6 years ago, 85% was a widely accepted industry target, but that has crept up to 90% or even 95% in many requirements. I’ll explain below why that is a much bigger deal than it may appear on the surface.
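The shift example above can be reproduced in a few lines. This is a sketch of the “parts in the box” method as described, with hypothetical class and function names. Note that carrying full precision through the multiplication gives roughly 86.4%, slightly below the 86.5% obtained by multiplying the three rounded terms.

```python
from dataclasses import dataclass


@dataclass
class ShiftData:
    planned_min: float      # planned production time (breaks etc. already removed)
    downtime_min: float     # unplanned downtime during planned production
    design_cycle_s: float   # design (target gross) cycle time
    good_parts: int
    rejects: int


def oee_terms(s: ShiftData) -> tuple[float, float, float]:
    """Return (availability, performance, quality) as fractions."""
    run_min = s.planned_min - s.downtime_min
    availability = run_min / s.planned_min
    theoretical_parts = run_min * 60.0 / s.design_cycle_s
    # Rejects are excluded here because they are counted in the quality term.
    # Performance is capped at 100%, as most specifications require.
    performance = min(1.0, s.good_parts / (theoretical_parts - s.rejects))
    quality = s.good_parts / (s.good_parts + s.rejects)
    return availability, performance, quality


shift = ShiftData(planned_min=415, downtime_min=20, design_cycle_s=10,
                  good_parts=2151, rejects=3)
a, p, q = oee_terms(shift)
print(f"availability={a:.1%}  performance={p:.1%}  quality={q:.1%}  OEE={a * p * q:.1%}")
```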

Mean Time Between Failure (MTBF) and Mean Time to Repair (MTTR)

Although outside the scope of a post on throughput, I thought I’d toss these two terms in here just because you’re likely to encounter them and maybe have to estimate them for some customers. The terms are pretty self-explanatory - on average, how long can I expect to go between downtime events (MTBF) and when they do happen, what’s the average amount of time that I’ll be down (MTTR). Perfectly logical questions and valuable tools when analyzing production data, but almost impossible to calculate upfront. If you produce a standard product, you may get real world data to average and publish, but if you do custom equipment, it’s a total crap shoot. The theory is pretty simple…..take a list of all the components on a system, determine how long each will last, then aggregate that mathematically into a model that shows the interaction between those parts. There are two practical problems with this fairly simple mathematical modeling. First, data from manufacturers varies wildly. I remember looking up the MTBF for a low cost, non-repairable (throwaway) cylinder and a large profile rail bearing. The published MTBF data for the non-repairable cylinder was twice that of the bearing, even though anyone in our industry would say that is completely backwards, and maybe even off in the wrong direction by a factor of 10. Second, the number is lab derived…..if I hang a significantly off-center load from that profile bearing versus directly over the saddle, the actual number changes drastically, but all you get is the one number from the manufacturer. Between those two factors and the fact that there are literally thousands (or even tens of thousands) of parts on an assembly line, doing the actual calculation is virtually useless. Then, throw in the guessing on repair times, and you have a guess piled on top of a guess.
Luckily, most customers have recognized this and stopped asking for the formal calculations, but back in the 90s and early aughts, it was routinely required that you provide the numbers and it was a part of the negotiations for the order. But, regardless of all that, just remember that “availability” term in the OEE calculation. You pretty much can’t make a runoff if you have any kind of unplanned downtime and even if you do survive runoff, there is nothing that destroys a customer relationship faster than a nagging production problem, so if you end up with a marginal solution, don’t put a bandaid on it. It will come back to bite you.
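For reference, the “simple theory” behind the formal MTBF calculation can be sketched as below. It assumes a textbook series model: every component has an independent, constant failure rate, and any single failure stops the line. The function names and component figures are invented, and the point is how sensitive the result is to its inputs, not the numbers themselves.

```python
def series_mtbf_h(component_mtbf_h: list[float]) -> float:
    """System MTBF when failure rates (1 / MTBF) simply add up."""
    return 1.0 / sum(1.0 / m for m in component_mtbf_h)


def steady_state_availability(mtbf_h: float, mttr_h: float) -> float:
    """Long-run fraction of time the system is up."""
    return mtbf_h / (mtbf_h + mttr_h)


# Made-up line: 2,000 components, each published at 1,000,000 hours MTBF.
published = series_mtbf_h([1_000_000.0] * 2000)
# Same line if the real-world figures are a tenth of the published ones.
derated = series_mtbf_h([100_000.0] * 2000)

for label, mtbf in (("published", published), ("derated", derated)):
    avail = steady_state_availability(mtbf, mttr_h=0.5)
    print(f"{label}: system MTBF={mtbf:.0f} h  availability={avail:.2%}")
```

A factor of ten error in the component data passes straight through to the system number, which is the “guess piled on top of a guess” problem in a nutshell.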

Applying the Definitions to Your Designs……

So why does this matter to you? Well, it can impact how you look at your machine design, so it’s best to understand what has been specified, what has been quoted, and how the machine/line will be accepted before you choose your solution and head down the design path. Here are a few things to think about.

There is no “right” number for OEE/Efficiency, but there are plenty of wrong ones specified

As I said above, the “industry standard” for OEE was 85%, but has slowly crept upwards to 90% and even 95%. Is that a problem for machine builders? Well, I can say that it is virtually always more costly to achieve a higher number, but it’s not necessarily problematic. What is problematic is that the target is basically hardcoded into every specification without any discussion or debate about the achievability of it or the cost of achieving it. Let’s look at the math for a line…. the number of stations you have directly impacts the ability to hit the overall goal by the following formula OEEs=(OEEi)^n where OEEs is the OEE of the system, OEEi is the OEE of each station (assuming the targets are all the same…the more generalized formula is OEE1 * OEE2 * OEE3 * …. * OEEn) and n is the number of stations on the line. So, if I have a 5 station line, in order to achieve a 90% system OEE, each station must achieve 97.9% (the fifth root of .90). If that same line has 10 stations, each station must hit 98.95%, which is a significant difference as I’ll explain in a second. On the other hand, you will also receive specifications for single stations that still list that same 90% OEE target. To put that all in context, if you must complete a four hour run at 90% OEE, in the five station line, each station can generate about 5 minutes of lost production (via the combination of downtime plus small production losses) out of four hours (4 x 60 x (1-.979)). In the second case, each station can only generate 2.5 minutes of lost production in four hours. And in the final case, the single station could be down for a whopping 24 minutes and still achieve the goal. That’s the danger of a one size fits all answer without discussion and agreement on what the numbers mean.
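The station math above is easy to rerun for your own line. A minimal sketch, assuming every station carries the same target and that station OEEs simply multiply as in the formula given; the function names are hypothetical.

```python
import math


def system_oee(station_oees: list[float]) -> float:
    """Generalized form: the product of the individual station OEEs."""
    return math.prod(station_oees)


def required_station_oee(system_target: float, stations: int) -> float:
    """Per-station OEE needed when all stations share the same target."""
    return system_target ** (1.0 / stations)


def allowed_loss_min(run_min: float, station_oee: float) -> float:
    """Lost production each station can generate over the run."""
    return run_min * (1.0 - station_oee)


# Four hour run at a 90% system target, for 1, 5, and 10 stations.
for n in (1, 5, 10):
    per_station = required_station_oee(0.90, n)
    loss = allowed_loss_min(4 * 60, per_station)
    print(f"{n:>2} stations: {per_station:.2%} each, {loss:.1f} min of loss per station")
```

Plugging in the real station count before accepting a blanket OEE target shows immediately how much room each station actually has.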

The type of equipment also has a significant effect. A lean cell with very limited automation and high levels of operator content typically has a higher potential OEE (assuming a motivated and well trained workforce) because people can “sprint” for short periods to make up for things that go wrong whereas automation keeps the same cadence and losses remain losses. A dial or precision link machine is 100% down with no ability to make things up if one station goes down for even a very short time, while a non-synchronous line can make up for small downtime events with no overall loss if the downtime occurs in the faster stations. If you have no buffers versus strategic buffers, that has an effect. Feeder bowls are notorious OEE sucks, even when they are built by highly qualified companies. Bent parts jam, some parts tangle, tuning changes over time. Complexity is typically non-linear, meaning an eight motion station is not usually twice as complex as a four motion station, but more like 2.5-3x more complex due to the interactions between the motions, packaging them into limited space, etc. So, all of this has to be kept in mind when you sign up for and then have to design a system to meet a given OEE. A few things will be discussed in the next section that have the potential to help you achieve the results you need.

Things to Think About To Maximize OEE

Spend the time to do an OEE risk analysis on each process. What could cause stoppages or downtime? Automatic parts feeding, close tolerance fits, parts with cast or molded features that must be used for nesting and location, etc. Anecdotally, I would guess considerably more than 50% of all production stoppages are because the equipment is not forgiving enough to handle all the product variation that exists. Engineers tend to be “nominal” thinking people. But production is messy, and equipment performance degrades over time, so even try to accommodate part variation outside of tolerance as long as it doesn’t impact cost. Yes, you’re within your rights to argue the machine isn’t responsible to turn bad parts into good, but it’s never a fun argument, and at best, involves a lot of parts sorting and measuring to prove your point. Again, not saying to add a bunch of cost. Just saying think about what you can do within the budget to make the machine care a little less about what it’s being fed.
Gravity may be a constant, but it doesn’t work consistently. Friction almost always beats gravity over time. So, things like gravity feed conveyors, counterweighted motions, lift assists, etc will start to cause downtime unless they are very well thought out. And I mean really, really pay attention to gravity feed conveyors. Make sure you take into account center of gravity of the parts, geometry of the surface that’s riding on the conveyor and the geometry of the conveyor rollers/surface, how it works fully loaded, how it works empty (it will be drastically different), starting parts moving from a dead stop, stopping them from full speed. I’ve spent a lot of time debugging gravity conveyors and have ripped a few out and replaced them with powered. They can be effective, but they can also turn into a nightmare.
While a customer typically won’t allow you to “get credit” in the calculation for a faster cycle time than the line rate, no multi-station line has perfectly balanced station cycle times. If you can group tasks such that the highest risk operations also happen to be on the faster stations, you’ll benefit on lean cells and non-synchronous conveyor lines (won’t matter on dials and precision links), especially if you also can take advantage of the next two bullets.
Make sure you buffer appropriately on lean cells and non-synchronous conveyor systems. Buffer size on a non-sync conveyor varies by cycle time and distances that pallets have to travel, but rarely would you have less than one pallet at a pre-stop, most of the times, two (I’ve seen 5-6 on lines that run at high speed), plus extra pallets for longer runs, corners, elevators, etc. Too many pallets makes the line run almost synchronously and actually hurt your throughput, but the right amount really can make up for small downtime situations on the faster running stations. And I know, using the words buffer and lean cell in the same sentence is not considered pure (buffers are literally one of the wastes listed in all the books), but I think over the last 10-15 years, even the most ardent lean advocates have started believing in small (maybe only one part) strategic buffers/kanbans to smooth the small variations any line will see.
Plan out the recovery steps for high frequency/quickly corrected stoppages. If it can be made safe, locate those processes outside of guards and near existing operators. If it has to be inside a guard, make sure the downtime associated with correcting an issue and restarting is as fast as possible. These are things like feeder bowl jams (don’t forget the inline portion), gravity conveyor jams, dunnage exchanges, etc. While no customer will want to live with their operator fixing a feeder bowl jam every five minutes, if the operator can fix 5-10 stuck parts a shift without losing any production time, that won’t count against you. But for any of those things to actually be beneficial, the closest operator needs to know there’s a problem almost immediately, so lights or buzzers or HMI notifications (or all of the above) are key to gaining that time back.
Keep stations on synchronous systems (dials and precision links) fairly simple. As mentioned above, any downtime stops everything, so complexity is the enemy to throughput. And unless the stations are literally simple pick and places, probes, etc, I would consider breaking up a bunch of stations into two or three smaller dials with buffer conveyors between rather than one giant 30 station precision link, for example.
Downtime for repairs/replacement counts in the calculation. Typically, that wouldn’t be a part of the short term runoff that constitutes acceptance on your floor or the customer floor, but it will play into the long term number. Presumably, you aren’t just trying to get buy off and then walk away….you want that customer forever. There is no better salesperson on earth than a line that outperforms all others on the customer’s plant floor. So, any wear tooling or regular preventative maintenance (lubrication, filter replacement, etc.) should be easily accessible and if possible, tool-less. Every minute (or even second) counts. If you’re a car race fan, you know that almost all races are won in the pits, not on the track. You should think that way, too.

Don’t Forget the Non-Cyclicals

One thing that is typically not excluded as non-production time is exchange of dunnage, so that counts against you in the OEE calculation. First, figure out who has to do it. Is it a “water spider” person that’s always around, but maybe not immediately there? The operator? A forklift operator that is tending giant sections of the plant? Even layer separators and small parts dunnage handling can create disruptions in flow. Make sure you think those things through and design your station layout and space for dealing with all that stuff that isn’t usually on a designer’s mind and then think about schemes like buffering and utilizing non-value-add time to do the material handling. I cover that in a little more detail in my post on designing by cycle time.

The Wildcard - People

Just like life, where people are involved, things get messy. A person is the ultimate in autonomous, intelligent, collaborative “automation,” but they also have bad days, varying levels of motivation and quality consciousness. And, in some cases, there are production incentives in place that make it advantageous for them to not be as fast while a line is being commissioned as they will ultimately be during production. All that notwithstanding, there is a pretty big learning curve on a manually intensive station. I remember working on an assembly line as a co-op and feeling like I was all thumbs and fighting like mad to keep up with line rate and I looked over at the lady doing the same job I was doing while she was reading a novel! When you look at the “performance” (or efficiency) term in the OEE calculation, that can be further broken down into automation performance and operator performance. We tend to be much better at estimating the times involved for the automation and the variation in actual performance is virtually nonexistent cycle to cycle. Between wildly optimistic estimated times in applications/design and widely varying operator performances, your cycle time risk is skewed considerably towards the manual content.

Why is any of that important? You have to “make rate” on the line for buy off with people that have never seen the line before and may or may not be motivated to have a successful runoff. That can create serious contractual headbutting. Many times, if the automation company’s own staff can run the line at rate (without looking like they are doing a cross-fit training session), that will go a long way towards achieving buy-off, so it definitely pays to practice running the line as if it were in production during your debug. You or others in the company may be asked to “prove it” to the customer.

Final Thoughts

Trying to determine the long term performance of a line during a few hours of runoff is a daunting task. That task is made more difficult by the fact that it’s contractually binding and both “sides” want to make sure that they are not the reason for a failed acceptance run, or worse yet, a poorly running line in production. I have often wondered how much excess capacity, cost, and contract negotiation energy is buried in each piece of capital equipment just to make sure that one event happens successfully, and if there is a better, more stepped approach where both the automation company and the final customer share a little more risk and work together more closely to achieve the goals over the 3-6 month production launch period. But that is hard to do in a fixed bid, competitive situation, so we all do the best we can to protect our respective companies and maintain a good relationship. While I don’t have any magic bullet answers, the two things that are absolutely key are keeping all this in front of you during the design, build and debug of the equipment and making sure that there is a clear acceptance plan in place long before the customer shows up for the runoff. None of it happens by accident!

William Budde, June 22, 2021