Retail shake-up in 2019! The covert 3D vision battle over smart containers

Zhidongxi (WeChat official account: zhidxcom)

Text | Ji Yusheng

After the frenzied battle of 2017 and the sharp fall of 2018, the smart container has finally arrived at what looks like the industry's ultimate form: the 3D dynamic smart container!

What can 3D dynamic smart containers bring to this industry? Beyond high space utilization, accurate product recognition, a low computational load, mature technology and a complete industrial chain, which giants will carve up this cake, the one that sits closest to users?

With these questions in mind, Zhidongxi interviewed dozens of industry leaders and domain experts over the past few weeks and found that the 3D dynamic vision container is on the eve of a large-scale breakout, and that a commercial contest over technology and the race to occupy locations is arriving in 2019.

First, after four upgrades in a single year, the ultimate form of the smart container appears.

Standing at the moment when 3D dynamic containers have appeared, we find that after more than a year of development, unmanned containers have left the grassroots stage, and refined operations and a clear division of roles are becoming the industry's general trend.

At this stage there are smart container platform vendors represented by Ali, container operators represented by Daily Fresh, and complete-cabinet manufacturers represented by Small Selling Cabinet. Looking further upstream in the industrial chain, 3D camera manufacturers represented by Tuyang Technology are ready, and providers of 3D dynamic submodules represented by Shenshi Technology have been waiting for some time.

From a technical point of view, everything has been developing rapidly and quietly. In just over a year, the product scheme has gone through four generations of upgrades.

The first step in the evolution from unmanned shelves to smart containers probably came at the beginning of 2017. In April of that year, the "CITYBOX" smart container was launched, relying mainly on RFID tags for automatic payment deduction.

In this mode of operation, every product carries an RFID tag costing about 0.5 yuan, and each shelf of the container is fitted with a corresponding sensor costing about 1,000 yuan, so that all products are detected by the sensors.

However, the RFID scheme was soon eliminated by the market: users could tear off the tags and walk away with goods unpaid, and deployment and operating costs were high. Some in the industry joked that RFID containers were really working for the label factories.

Then, at the beginning of 2018, the machine vision smart container arrived, represented by Deep Blue Technology, an atypical smart container player, and brought the market into the "camera" era.

In this scheme, one camera is placed at the top center of each shelf, or one camera on each of the left and right sides of each shelf. The algorithm then completes the deduction by comparing the goods on each shelf, as recorded by that shelf's camera, before and after the door is opened.
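
The before-and-after comparison can be illustrated with a minimal sketch. It assumes the per-shelf cameras have already been turned into item counts by some recognition step that is not shown here; the function name `settle`, the SKU names and the prices are all hypothetical and do not come from any vendor's system.

```python
from collections import Counter

def settle(before, after, prices):
    """Return (items_taken, amount_due) from counts before and after a door session."""
    # Counter subtraction keeps only positive differences, so items that
    # were added or put back do not produce negative charges.
    taken = Counter(before) - Counter(after)
    amount = sum(prices[sku] * qty for sku, qty in taken.items())
    return dict(taken), amount

# Hypothetical shelf state recognized before and after the door was opened.
before = {"cola": 5, "water": 4, "tea": 3}
after = {"cola": 4, "water": 4, "tea": 1}
prices = {"cola": 3.0, "water": 2.0, "tea": 4.5}

items, amount = settle(before, after, prices)
print(items, amount)
```

The sketch only works if every item is visible in both snapshots, which is why stacking and occlusion matter so much for the static scheme.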

However, this scheme requires each camera to capture every change to the goods on its shelf, so goods cannot be stacked and a large gap must be left between the camera and the goods. The scheme therefore guarantees security but wastes a great deal of space.

In addition, whenever SKUs are added or removed, the static scheme must be retrained on the placement of each SKU to cover all the possible pick-and-place situations, so the model tends to overfit and the range of SKU categories is limited. At present, mainstream static solutions on the market are still at the stage of selling typical standardized products such as beverages.

The dynamic vision scheme makes up for this defect. In March 2018, Yitong made its debut at the "China Retail Digital Innovation Conference". It also uses computer vision to identify goods, but the dynamic scheme uses four cameras at the door to identify the goods in the user's hands after the door is opened, so it places almost no requirements on how goods are arranged inside the container, and the number of cameras inside the cabinet is reduced.

However, static recognition can upload all its data to the cloud for identification, whereas dynamic recognition must process every pixel across continuous multi-frame video, which demands heavy computation and local deployment. Specifically, the traditional dynamic scheme usually requires 720p cameras running at 60 frames per second. The GTX 1070 graphics card most commonly used for the computation costs about 5,000 yuan, and a motherboard, CPU, memory and enclosure are needed as well. Once a full system is deployed, the cost of a single cabinet rises by nearly 10,000 yuan.
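
To give a rough sense of that load, the sketch below counts raw pixels per second. It assumes 720p means the standard 1280 x 720 resolution and uses the four door cameras mentioned for the dynamic scheme; it says nothing about how much work a real model does per pixel.

```python
def pixels_per_second(width, height, fps, cameras):
    """Raw pixel throughput the recognizer has to keep up with."""
    return width * height * fps * cameras

# Assumption: 720p = 1280 x 720, 60 frames per second, four door cameras.
rate = pixels_per_second(1280, 720, 60, 4)
print(f"{rate:,} pixels per second")  # 221,184,000 pixels per second
```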

To reduce the cost of local deployment and the impact of background clutter on recognition efficiency, a 3D dynamic scheme formally appeared at the beginning of 2019.

The main difference between the 3D dynamic scheme and the traditional dynamic scheme is that a 3D camera is introduced for positioning. On top of the original 2D camera capture, the 3D camera locates the items in the user's hand at the pixel level in space; the irrelevant background is then erased and only the goods in that specific region are recognized, which cuts the computational load and therefore the cost.
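
The background-erasing idea can be sketched with a simple depth gate. This is an illustration under stated assumptions, not any vendor's pipeline: it assumes the depth map is already aligned pixel for pixel with the 2D image, and the depth band (200 to 600 mm), the toy 4 x 4 frames and the function names are hypothetical.

```python
def foreground_mask(depth_mm, near_mm, far_mm):
    """True where depth falls inside the band in which a hand holding goods is expected."""
    return [[near_mm <= d <= far_mm for d in row] for row in depth_mm]

def apply_mask(image, mask, fill=0):
    """Erase every pixel outside the mask so the recognizer can skip it."""
    return [[px if keep else fill for px, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]

# Toy frames: the hand and product sit about 400 mm away, the background at 1500 mm.
depth = [
    [1500, 1500, 1500, 1500],
    [1500,  400,  420, 1500],
    [1500,  410,  430, 1500],
    [1500, 1500, 1500, 1500],
]
image = [
    [10,  11,  12, 13],
    [14, 200, 201, 15],
    [16, 202, 203, 17],
    [18,  19,  20, 21],
]

mask = foreground_mask(depth, near_mm=200, far_mm=600)
masked = apply_mask(image, mask)
kept = sum(sum(row) for row in mask)
total = len(mask) * len(mask[0])
print(masked)
print(f"{kept}/{total} pixels passed to the recognizer")
```

In this toy case only a quarter of the pixels survive the gate, which is the kind of saving the scheme relies on.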

Second, three leading players take the field, and the battle of 2019 is imminent.

On the eve of the 3D dynamic vision container's breakout, the first product to appear was the "Jim series dynamic vision smart container", released by Small Selling Cabinet in mid-February 2019.

According to insiders, Small Selling Cabinet has not yet begun large-scale promotion of this product, but it is in small-batch production and can be seen at some exhibitions.

Technically, the product is developed mainly with the Intel OpenVINO toolkit. Combining 3D+2D dynamic visual recognition with gravity sensing, the container can hold 240 items, stacked, with a settlement accuracy of up to 99%. Real-time interaction and recognition work whether the user takes goods with one hand, with both hands, or several times in a row.

In terms of hardware, the Jim series dynamic vision smart container uses low-power edge computing equipment to accelerate model inference, so settlement is completed locally the moment the user closes the door, greatly reducing the settlement time and bandwidth cost of each purchase.

In terms of power consumption, the product has a capacity of 510 liters and carries a 21.5-inch LED screen that displays the user's items and prices in real time, yet it consumes only 3 kWh per day.

Besides complete-cabinet manufacturers such as Small Selling Cabinet, Ali is also actively exploring this area from the platform vendor's perspective.

Ali's New Retail Intelligent Business Group started its smart container project around Double Eleven in 2016, and at the end of 2018 it formally began exploring 3D dynamic smart containers.

It is reported that while laying out this 3D dynamic scheme, Ali evaluated products from three solution providers, with accuracy, price and user experience all important considerations.

However, according to Ali insiders, it will take some time to test and optimize the equipment before it is officially rolled out. At present, a small number of units have started testing at Alibaba's Xixi Park. Ali is expected to roll out the equipment on a large scale later this year.

As for the container operator Daily Fresh, it is reported to have been exploring the 3D dynamic vision scheme as early as April 2018, and has put it into trial operation at some locations.

Third, with the market about to take off, 3D vision algorithm providers are in place.

"If the three-dimensional solution can't run out, don't be a smart container." Asked about the development of smart containers in the next few years, Zhang Lei, CEO of Shenshi Technology, told Zhi Zhi.

Shenshi Technology is a provider of 3D computer vision algorithms. Its core founders all graduated from the Electronics Department of Peking University, have more than ten years of experience in chips, algorithms and computer vision, and hold a number of patents in related fields.

As early as mid-2017, when the wave of unmanned shelves was just emerging, Zhang Lei and the two other core founders set their sights on smart containers built on a 3D dynamic vision scheme.

They are responsible for the 3D dynamic vision submodule in the container; put simply, the development of the product recognition algorithm and the procurement and configuration of the corresponding hardware in a 3D dynamic solution.

In Zhang Lei's view, the mainstream smart container schemes on the market all have more or less fatal flaws. The three advantages of the 3D vision scheme, namely high space utilization, low local deployment cost and high recognition accuracy, address exactly the shortcomings of earlier schemes.

When he committed to this work, he first set a principle: the scheme should be universal and efficient.

Previously, most 3D dynamic vision schemes on the market resembled Microsoft's Kinect, the 3D vision gaming device, but at the time such schemes could only run on x86 platforms. Used at industrial scale, the cost would be fatal.

Optimizing the algorithm for the ARM platform according to its hardware characteristics therefore became the top priority. Once that was solved, the cost dropped sharply. Zhang Lei said that a Shenshi Technology system packaging CPU, GPU and memory now costs only about 2,000 yuan, roughly one third of the 2D dynamic scheme.

With the technical problems solved, turning a demo into a solution that is stable across the industry is also a problem that cannot be ignored.

Take the simplest example, the layout of the four 2D cameras. Most people would assume that the top two cameras should face down and the bottom two should face up, so that the user's actions are captured completely and clearly.

In practice, however, this layout simply does not work: girls wearing short skirts in summer are enough to make it very embarrassing. After discussion, the team settled on four 2D cameras, two at the top and two in the middle, plus one 3D camera at the top center, all shooting downward.

Even the position of the 3D camera was discussed and optimized repeatedly. At first, because the 3D camera has a blind zone, people mounted it higher, but then the camera could no longer capture the user, which affected recognition efficiency.

In addition, users sometimes hold several products in one hand, which also affects recognition efficiency.

This looked like a complicated problem calling for hardware upgrades or algorithm optimization, but in the end it was solved by adding a transparent baffle at the top of the cabinet and under each shelf, so that users can neither take goods out through the blind zone nor take out too many goods at once. An industry heavyweight who came to visit joked that the board deserved a patent application.

With this series of problems solved, Shenshi Technology's current system has completed small-scale deployment and internal testing at some mainstream cabinet manufacturers.

Fourth, with the market about to take off, 3D cameras enter the era of customization for smart containers.

Tuyang Technology, the 3D camera supplier to Shenshi Technology, also turned its attention to the consumer field around 2017.

In the view of Fei Zheping, CEO of Tuyang Technology, industrial applications of 3D cameras have gradually matured and are enough to support stable profits, while the retail industry, which also has an urgent need for 3D vision, is still a blue ocean.

Although the underlying hardware differs little between the industrial and retail fields, adjusting camera parameters and lens configurations for a different industry still takes a long time.

In the year between the decision in 2017 to enter the consumer field and the gradual emergence of demand in 2018, Fei Zheping focused mainly on polishing products for specific consumer scenarios.

The first issue is the choice of technology. The 3D camera solutions currently on the market fall mainly into three types: TOF, RGB binocular and structured light.

Of the three mainstream schemes, structured light and TOF are the more mature. Structured light is the most mature, but it is easily disturbed by ambient light, responds slowly and has low recognition accuracy. TOF has some advantages over structured light in these respects, which has made it a promising scheme for mobile devices. The binocular stereo imaging scheme, based on the parallax principle, has strong resistance to interference and high resolution, and is also an option for mobile devices. However, a pure binocular scheme has a drawback: in scenes with monotonous texture it cannot find matching points and fails.

Unlike the traditional schemes on the market, Tuyang adopts active binocular vision. Its 3D vision sensor consists of a binocular infrared camera, a color camera and an optical enhancement system; in other words, the binocular scheme is fused with the structured light scheme.

The optical enhancement system, which the industry also calls structured light, is essentially a laser projector, and the binocular camera acts as the receiver. When the projected light hits the surface of an object, the object reflects it to the two cameras, which collect the corresponding information; a pattern matching algorithm then yields physical properties of the object such as length, width, height and distance. This overcomes the accuracy and efficiency shortcomings of the schemes above.
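
How two cameras turn a matched point into a distance can be sketched with the textbook triangulation relation for a rectified stereo pair, depth = focal length x baseline / disparity. This is the generic model and not a description of Tuyang's algorithm; the focal length (700 px) and baseline (50 mm) below are hypothetical values chosen only for illustration.

```python
def depth_mm(focal_px, baseline_mm, disparity_px):
    """Distance to a point from its pixel shift between the left and right images."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; no match means no depth")
    return focal_px * baseline_mm / disparity_px

# Hypothetical camera: 700 px focal length, 50 mm between the two infrared cameras.
for disparity in (175, 70, 35):
    print(f"disparity {disparity} px -> {depth_mm(700, 50, disparity):.0f} mm")
```

The relation only works when a match exists, which is why the projected pattern matters: it gives textureless surfaces features that both cameras can match.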

Besides the technical problems, there is also industry customization to deal with.

Generally speaking, cameras used in the consumer field are priced only in the low to middle range, because they do not need high accuracy at long distances. However, this field places higher demands on the hardware's blind zone, viewing angle and speed.

In terms of frame rate, a typical 3D camera runs at 30 frames per second, but a smart container needs 60 frames per second to keep up with users picking and placing goods quickly. In terms of the blind zone, the mainstream solutions on the market all have large ones and generally return data only beyond 50 cm. In a smart container this must be shortened to under 20 or even 15 cm to prevent users from stealing goods through the blind zone. The lens angle must correspondingly be widened from 60 degrees to 90 or 100 degrees.
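
One way to see why the lens has to widen as the working distance shrinks is the simple pinhole relation: the width covered at distance d with a field of view of theta is 2 x d x tan(theta / 2). The sketch below applies it at the 15 cm distance mentioned above. It ignores lens distortion and mounting angle, and the function name is hypothetical.

```python
import math

def coverage_width_cm(distance_cm, fov_deg):
    """Width of the strip a lens sees at a given distance (ideal pinhole model)."""
    return 2 * distance_cm * math.tan(math.radians(fov_deg) / 2)

for fov in (60, 90, 100):
    width = coverage_width_cm(15, fov)
    print(f"{fov} degree lens at 15 cm covers about {width:.1f} cm")
```

Under these assumptions a 60 degree lens sees only about 17 cm of width at 15 cm, while a 90 degree lens sees about 30 cm, which is closer to the span of a cabinet door opening.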

It sounds simple, but it involves replacing lenses, retuning sensor parameters and investing in production capacity again. Without anticipating market demand or having enough technical depth, a vendor cannot meet these customer needs.

Once the demo is complete, ensuring that it works in real situations still requires a long road of stress testing. The most basic questions, such as which behaviors are normal, which are violations, what abnormal consumer behavior occurs in real conditions, and what requirements this places on hardware parameters, all need time to verify.

However, obtaining consumer behavior data is not something hardware manufacturers are generally good at. Working through this with the support of leading customers has therefore become a hurdle that must be crossed. Fei Zheping said that Tuyang has reached in-depth cooperation with a number of leading 3D dynamic container manufacturers.

Fifth, the future possibilities and current limitations of 3D smart containers.

Why make smart containers? Different players have different considerations. For brand vendors, it may bring more container sales. For an operator such as Daily Fresh, it may double operating efficiency and rapidly cut costs. For Ali, as an important route for exploring new retail, it may create an offline Tmall.

At this stage, the explosive growth of domestic e-commerce is unlikely to return, and most of the remaining untapped users are in rural areas and offline. In rural areas, both the speed and the extent of growth are very limited, while offline is almost uncultivated virgin land. Used properly, it is likely to bring explosive growth.

A simple calculation: if an operator deploys 20,000 units and each unit takes only 15 orders a day, daily orders reach 300,000. Pinduoduo, the e-commerce upstart, had daily orders of only 300,000 to 400,000 two years after it was founded. For Ali, this is almost like creating an offline Tmall from scratch.

Smart containers embedded in offline scenes can also do some things that the online Tmall and Pinduoduo cannot. For example, their natural advertising display properties can shorten the distance to users.

Take Youbao Online, a traditional vending machine manufacturer, as an example. Its financial report shows that in the first half of 2018, Youbao Online operated about 55,000 devices, with operating income of about 1.141 billion yuan and net profit of 86,048,500 yuan, of which advertising income reached 265,438.

Measured against the density of Japan, which has more than 5 million vending machines, the current market is far from saturated. And if the screens on vending machines and smart containers in China are put to use, building another Focus Media would not be difficult.

The future holds many possibilities, but there are still problems to solve today; going from technology taking shape to market maturity involves interaction and cooperation across the whole industrial chain.

Although the application of 3D dynamic recognition in smart containers has basically taken shape, operating efficiency under real conditions and the production capacity of other supporting hardware still need to be improved gradually through continued optimization.

First, in terms of technical indicators, recognition errors when a user takes three or more products with one hand still need to be reduced. Improvements for such edge cases also depend on feedback from container operators in real deployments.

In terms of efficiency, although training for 3D dynamic recognition is much faster than for traditional static SKU recognition, putting a large number of SKUs on the shelves still means a need for about 2,000 training samples per item, which has to be weighed in terms of algorithm, computing power, cost and time.

In terms of supporting facilities, although 3D vision technology is relatively mature, the micro gravity-sensing equipment used for checking and verification has not yet been customized for the retail industry, which also affects when the equipment can launch at scale.

However, now that the technology and the business model have taken shape, the rest is a matter of time.

Conclusion: The final form is set; will the market replay the 2017 battle for locations?

From their rise to the present, unmanned containers have ridden a roller coaster for two years. In the first year everything flourished, and capital and competition hit one climax after another. In the second year, players retreated en masse and the product form changed four times within the year.

Now that the ultimate form has basically settled on the 3D dynamic scheme and the technology has taken shape, where will the second half of the smart container game go? Will the 2017 battle for locations reappear?

Perhaps the maturing of the technology will give the industry a shot in the arm, but the market is never that simple. Product form is only the tip of the iceberg. Below the waterline, the battle over supply chains, the battle over payment entry points and the battle to lock in suppliers ... everything has yet to be decided.

Nevertheless, technological progress has opened up great possibilities for the industry. Amid the chorus that the smart container is dead and unmanned retail has no future, 3D dynamic vision is pushing the smart container toward the eve of another breakout.