The Embedded GPU Opportunity

Peter Debenham - Senior Consultant, Signal Processing

28th June 2017

The CPU (Central Processing Unit) has often been called the brains of a PC. But increasingly, the CPU is being enhanced, even supplanted, by another component in the PC, the GPU (Graphics Processing Unit). So what does this mean? Well, in layman’s terms, the PC’s brain has evolved.

Enter GPU-accelerated computing. In recent years, GPUs have moved from being highly specialised graphics processors to being high-speed general-purpose computing systems. But how can GPU processing be implemented, and what could it achieve?

To address this, we have to look at three of the most common requirements for computing:

1. Significant processing capability

2. Rapid/low-cost development

3. Low Size, Weight, and Power (SWaP) solution

To fully realise the opportunity to deliver all three of these requirements with GPU processing, we first have to look at the traditional approaches that paved the way before us.

The traditional approach – you can only meet two of the above requirements

Traditional technologies and skillsets can only cater for two of these three requirements at once, and typically follow one of the scenarios below:

Rapid development and low SWaP:

Use embedded microprocessors and microcontrollers, with software usually written in C, often running on a Real-Time Operating System (RTOS).

Downside: the maximum processing capability of such devices is limited.

Significant processing and rapid development:

Use a conventional PC, perhaps with GPU acceleration. You could write low-level code in C, but you are just as likely to use mathematical processing systems such as Mathcad or Octave, with their vast number of pre-written libraries.

Downside: certainly not low SWaP.

Significant processing and low SWaP:

Implement a bespoke design on a field-programmable gate array (FPGA). Many FPGA System-on-Chip (SoC) devices (e.g. Xilinx Zynq) are very efficient in terms of power consumption.

Downside: FPGA development is notoriously slow and the resulting designs are hard to debug.

The embedded GPU alternative

GPU-accelerated computing was pioneered in 2007 by the computing technology company NVidia. By using a GPU-based SoC with CUDA (NVidia’s general-purpose parallel computing environment) or OpenCL (an alternative open standard for cross-platform parallel programming), we have the possibility of providing significant processing capability, rapid development, and a low-SWaP solution.

The most recent embedded GPUs offer processing performance comparable to upper-end PC-grade microprocessors and FPGAs, while their power consumption has fallen into the FPGA range. The embedded GPU therefore meets all three requirements.

With CUDA in particular, the quality of the development tools and libraries available from NVidia makes GPUs an extremely viable platform for exploring the art of the possible.
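
To give a flavour of the programming model, below is a minimal CUDA sketch (illustrative only, and not taken from any particular project) that adds two signal buffers in parallel. The function and buffer names are hypothetical and error handling is omitted for brevity; the same kernel would run unchanged on a desktop GPU or an embedded GPU SoC.

#include <cstdio>
#include <cuda_runtime.h>

// One thread per sample: the grid of GPU threads replaces the serial CPU loop.
__global__ void addSignals(const float* a, const float* b, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main()
{
    const int n = 1 << 20;                  // hypothetical buffer of ~1M samples
    const size_t bytes = n * sizeof(float);

    // Managed (unified) memory keeps host and device views in step; on an
    // embedded GPU SoC the CPU and GPU share the same physical RAM anyway.
    float *a, *b, *out;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&out, bytes);

    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addSignals<<<blocks, threadsPerBlock>>>(a, b, out, n);
    cudaDeviceSynchronize();                // wait for the GPU to finish

    printf("out[0] = %f\n", out[0]);        // expect 3.0

    cudaFree(a);
    cudaFree(b);
    cudaFree(out);
    return 0;
}

The point to note is that the per-sample arithmetic is written once and the hardware schedules thousands of threads to execute it concurrently, which is where the processing capability comes from.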

To summarise, one of the most exciting developments in computing over the last twenty years has been the growth of highly parallel processing in GPUs. Traditionally, this has concentrated on providing the maximum processing capability for supercomputers or on driving high-end gaming graphics.

However, with the arrival of embedded GPUs, we can now provide solutions and meet requirements in areas that traditional methods are unable to address. Long live the GPU.

For more information about GPU Computing, you can read Peter’s white paper on the subject here.

The dreaded security bugs… call the Yocto Exterminator!

Alan Levy - Lead Consultant, Embedded Systems

17th May 2017

In modern electronic product development, many of the key features tend to be realised in software or firmware. This is what turns a lump of metal, plastic and silicon into a useful device.

My speciality is embedded software development and my role is focussed on embedded operating systems, which more often than not these days means Linux. As a result, the security of embedded Linux systems is a key issue and one that I personally spend a lot of time thinking about.

It would seem reasonable to me that a customer should be able to expect two things from the embedded software in the product they have just bought. Firstly, that it has been developed by engineers who know what they are doing, and secondly, that it should be adequately bug free and secure. In other words, fit for purpose and right from the outset.

In reality, the time taken, in any product development project, to reach key milestones is crucial and any way of speeding up the process always seems attractive to project managers. By contrast, any professional software engineer will tell you that taking shortcuts to speed up development tends to go hand in hand with an increase in both bugs and security risks.

Even if your software appears to be solid on day one, bugs do emerge over time and security flaws are revealed. Suddenly, that top of the range consumer product you put on the market several months ago becomes the target of choice for every hacker on the planet.

At that point (and if you’re sensible, long before then), your top priority is to decide what to do about it.

A perfectly acceptable reaction is to say “just fix the bugs and update the devices as quickly as possible”. Sadly, in practice, a secure remote software update is a very complicated proposition and is therefore done infrequently or, sometimes, not at all. Worse still, software update mechanisms can themselves introduce reliability and security holes if they are not properly designed and implemented. Homebrew update schemes also have a track record of creating, rather than solving, such problems.

What’s in an update?

Historically, embedded products have typically been updated by downloading the entire system image from a website and reprogramming everything every time. This has significant costs in terms of communications bandwidth, device downtime and difficulty of recovery from update errors.

But a full image update is not the only approach. Ideally, your update scheme needs:

• A simple, secure, modular and automated update mechanism so that the customer doesn’t need to worry about it.

• A way to track the statuses of numerous third party packages that make up a Linux distribution and a method that works out how to apply any changes to your software.

• An approach that minimises the costs involved in obtaining patches and bug fixes for third party packages. These patches and bug fixes will then need to be integrated into your software and validated.

At this point, you are probably wondering how to deal with this. Well, the answer is not to invent your own scheme but to use the solutions that other people have spent years developing and fine-tuning.

This is where ‘Yocto’ enters the picture. Yocto is a software toolset designed to build highly customised, small embedded Linux distributions for resource-limited hardware. The name ‘Yocto’ was actually chosen because it is the naming prefix for the smallest measurement scale (10⁻²⁴) in the SI system of units.

Yocto is based on layered, configurable scripts known as “recipes” that are “baked” to generate the software. In my opinion, Yocto has a number of advantages from a security perspective when compared with similar toolsets such as ‘Buildroot’. In particular, it is simple to use, extremely flexible, and it offers embedded Linux developers access to a massive base of open-source Linux software packages. That includes a number of reliable, secure, tried and trusted package-based update mechanisms.

What’s in a package?

• Every application, system utility and service (including the Linux kernel) is a package. Just as with Linux on desktops and servers, you can update individual packages rather than the entire system image. This minimises the time and effort to rebuild/re-test the software as well as the bandwidth and time required to update the devices in the field.

• Package maintenance is done by the teams that develop the corresponding packages. Package updates are also regularly integrated into the Yocto project and then made available to product developers. This means that all of the details of maintaining and securing packages and recipes are taken care of by people who understand them. All a product developer then has to do is use the latest recipes from the public Yocto Git repositories.

• Devices can also be updated using standard Linux package managers, such as ‘RPM’ (itself just another updateable package). These package managers support cryptographic signing and secure updates, and because changes are applied package by package, if an update fails for any reason the target system is still up and running. You also get the opportunity to try again instead of presenting the end user with a useless electronic brick.

Of course, Yocto won’t solve all of your problems but it does give you a robust starting point. When it comes to product development, it can save you months of development effort, making the project manager happy and putting you on the front foot. Most importantly, it might save you from that embarrassing security bug that would otherwise haunt you for the rest of your career.

For more information about Yocto, you can read Alan’s white paper on the subject here.

Blackbirds, Ambassadors and the 7 Layer Model

Peter Massam - Principal Consultant, Signal Processing

10th May 2017

The world is full of information; its inhabitants never stop sending and receiving data. The dogs barking at night and the blackbirds singing in the garden are both transmitting signals. And, over time, we, as human beings, have developed more and more sophisticated ways of moving information.

It’s easy enough to pass information to the person in front of you. All you have to do is speak the same language. Over the years, our culture has evolved a way of passing information as sounds that have a mutually agreed meaning. But that is not all; this very same culture of ours has developed a range of other cues to help in this communication. A look of puzzlement, a shrug or a nod – they all mean something.

Suddenly, this simple example starts to look quite complicated, so what about the more complex cases?

Well, how do you get information from A to B?

One part of the answer is by using ‘protocols’, but what is a ‘protocol’?

The online Oxford English Dictionary has four definitions for the word ‘protocol’.

Top of the list is “The official procedure or system of rules governing affairs of state or diplomatic occasions”. At the bottom of the list is “A set of rules governing the exchange or transmission of data between devices”.

In truth, the two meanings are very similar and this can be demonstrated with the following example. The prime minister of country A calls in the ambassador for country B and hands her a formal letter expressing displeasure at the behaviour of her country’s snooker fans.

The ambassador passes the letter to her secretary, together with her personal notes on the meeting. The secretary writes up the notes and gives the letter and the notes to the diplomatic courier who flies to country B. Once there, they hand them to a deputy under-secretary at the foreign office.

Eventually, the foreign minister receives them, replaces the ambassador’s observations with his own and hands both to the prime minister of country B.

A key element in this system is localised ignorance. The prime ministers are communicating, but do not care which secretary or under-secretary was involved. They deal respectively with the ambassador and the foreign minister, who in turn deal with their secretaries, who deal with the under-secretaries and so on and so forth.

The diplomatic courier does not know or care what is in the bag.

This breaks up the problem of exchanging messages across continents into steps. Each step can then be handled by those who specialise in solving that small part of the whole problem.

The designers of electronic communication systems have adopted this well-tried technique of simplifying a big problem by breaking it up into smaller tasks. A common model used to guide the design is the Open Systems Interconnection model (OSI model), also known as the 7 Layer model.

Layer 7 of the model is the “Application Layer”, the highest layer, which interfaces to a user’s application. Stretching my earlier example further, this can be compared to the ambassador and the foreign minister.

Layer 1 is the “Physical Layer”, the lowest layer that transfers the data across the physical medium (e.g. the wire of an Ethernet cable or the plane carrying the diplomatic courier).

Each of the layers in the model adds some distinct type of functionality to the communication system. Layer 2 is concerned with node-to-node data transfer, whereas Layer 3 is concerned with communication within a network of multiple nodes. Layer 4 deals with communicating across multiple networks.

A key principle of the model is that each layer communicates with a peer at the other end of a communication link and each layer does this by using the services of the layer below it. It is also important to understand that not all the layers are used in every communications link.

For instance, in a router, messages for a printer that have arrived from a PC over a WiFi interface will only reach Layer 3, because Layer 3 has the knowledge to route them to the Ethernet interface connected to the printer. In other words, when the courier arrives at the airport in country B, he does not ask the foreign minister how to get to the foreign office – he asks the information desk.
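
To make the layering concrete, here is a minimal C-style sketch of encapsulation, in which each layer prepends its own header to the payload handed down by the layer above, much as the ambassador’s letter ends up inside the courier’s diplomatic bag. The structures and field values are hypothetical and heavily simplified; real header formats are defined by the relevant standards (e.g. IEEE 802.3, IPv4, UDP).

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified headers for three of the layers. */
typedef struct { uint8_t dst[6], src[6]; uint16_t ethertype; } link_hdr;        /* Layer 2 */
typedef struct { uint8_t src_addr[4], dst_addr[4]; uint8_t protocol; } net_hdr; /* Layer 3 */
typedef struct { uint16_t src_port, dst_port, length; } transport_hdr;          /* Layer 4 */

/* Each layer wraps the data given to it by the layer above with its own
 * header and passes the result down, without looking inside the payload. */
static size_t wrap(uint8_t* frame, const void* hdr, size_t hdr_len,
                   const uint8_t* payload, size_t payload_len)
{
    memcpy(frame, hdr, hdr_len);
    memcpy(frame + hdr_len, payload, payload_len);
    return hdr_len + payload_len;
}

int main(void)
{
    const char* app_data = "formal letter of protest";   /* Layer 7 payload */
    uint8_t l4[256], l3[512], l2[1024];

    transport_hdr t = { 40001, 9100, 0 };
    t.length = (uint16_t)strlen(app_data);
    net_hdr n = { {10, 0, 0, 1}, {10, 0, 0, 2}, 17 };
    link_hdr l = { {0}, {0}, 0x0800 };

    size_t len = wrap(l4, &t, sizeof t, (const uint8_t*)app_data, strlen(app_data));
    len = wrap(l3, &n, sizeof n, l4, len);   /* Layer 3 wraps Layer 4 */
    len = wrap(l2, &l, sizeof l, l3, len);   /* Layer 2 wraps Layer 3 */

    printf("frame on the wire: %zu bytes\n", len);
    return 0;
}

At the far end the process is reversed: each layer reads and removes only its own header before passing the remainder up to the layer above.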

The OSI model is useful for understanding existing protocols and designing new ones, but it must be used with care. There are many different interpretations of how it should be applied because, essentially, there are many different communications scenarios.

Here is one such scenario. A few years ago, Plextek designed a Formula 1 telemetry system that transferred data from the cars to the pit in real time. At the time, the regulations forbade any form of communication from the pit to the car because the authorities were concerned that it would be used to re-configure the car during the race. The regulation was strictly enforced, so a protocol had to be implemented that provided a one-way communication path, with no feedback to indicate whether data had been successfully transferred to the pit. Frustratingly, a few years later the regulation was relaxed to allow more conventional protocol designs to be used.
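
As a rough illustration of how such a constraint can be handled (a sketch under my own assumptions, not the protocol that was actually implemented), a one-way link typically relies on the transmitter numbering its frames and protecting them with a checksum, so that the receiver can discard corrupt frames and count gaps without ever sending a reply:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical one-way telemetry frame: the car only transmits, so reliability
 * comes from detection and redundancy rather than retransmission. */
typedef struct {
    uint16_t seq;          /* incrementing sequence number   */
    uint16_t payload_len;
    uint8_t  payload[32];
    uint16_t crc;          /* integrity check over the frame */
} telemetry_frame;

/* Toy 16-bit checksum for illustration; a real link would use a proper CRC. */
static uint16_t checksum(const uint8_t* data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i) sum += data[i];
    return (uint16_t)(sum & 0xFFFFu);
}

/* Pit-side receiver: with no return channel, all it can do is drop corrupt
 * frames and note how many sequence numbers were skipped. */
static bool accept_frame(const telemetry_frame* f, uint16_t* expected_seq,
                         unsigned* lost_count)
{
    uint16_t c = checksum((const uint8_t*)f, offsetof(telemetry_frame, crc));
    if (c != f->crc)
        return false;                                      /* corrupted: discard */
    if (f->seq != *expected_seq)
        *lost_count += (uint16_t)(f->seq - *expected_seq); /* gap: frames lost */
    *expected_seq = (uint16_t)(f->seq + 1u);
    return true;
}

int main(void)
{
    telemetry_frame f = {0};
    f.seq = 0;
    f.payload_len = 4;
    f.payload[0] = 42;
    f.crc = checksum((const uint8_t*)&f, offsetof(telemetry_frame, crc));

    uint16_t expected = 0;
    unsigned lost = 0;
    printf("frame accepted: %s, frames lost so far: %u\n",
           accept_frame(&f, &expected, &lost) ? "yes" : "no", lost);
    return 0;
}

Any further robustness has to come from the transmit side alone, for example by repeating important values in later frames, because the receiver has no way to ask for a resend.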

Another example of an unconventional protocol is the one used in the Telensa street-lighting system. Although bi-directional communication is permitted, the throughput available in each direction differs by a factor of 10. The design of that protocol was also heavily influenced by the regulations governing the ISM bands being used.

Despite the different interpretations of the model that exist, particularly in the wireless domain, it remains a useful starting point. When I start analysing an unfamiliar protocol, I invariably look for the similarities between its structure and that of the OSI model.

Of course that’s not the end of it, but perhaps that is a blog for another time…

Image credits: Blackbird: Malene Thyssen

Ambassador: “John Adams 1st American Ambassador to English Court, Presented to King George III”
