
Smart city video analytics: Art, science and secret sauce

November 15, 2021 6:57 am


Menashe interview

Menashe Rothschild is the co-founder and Chief Product Officer at viisights. The company provides advanced behavioural recognition systems for real-time video intelligence, used by cities, transport hub operators, education campuses, industrial zones and more.

Rothschild shares his insights on how video analytics technology is advancing.

What does your role as Chief Product Officer at viisights entail?

I am in charge of both defining what the product is and executing on that definition.

What drew you to this field, and what excites you in it?

We are different from other vendors because we analyse scenes that are not ‘sterile’: the system doesn’t make its decision from a single image but from a sequence of images in a sliding window. Most other video analytics products work in sterile environments, but the urban environment is not sterile – there are vehicles, bicycles, scooters and, of course, people, all intermixed.

Understanding behaviour in such a ‘noisy’ environment is a challenge that most of the research in this domain is not addressing at the moment.
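
As a rough illustration of what a sliding-window approach can look like in practice, here is a minimal Python sketch. The window size, stride and the `classify_clip` model are all illustrative assumptions, not a description of the viisights system.

```python
from collections import deque

WINDOW_SIZE = 32   # frames per analysis window (assumed value)
STRIDE = 8         # run inference every STRIDE new frames (assumed value)

def analyse_stream(frames, classify_clip):
    """Slide a fixed-size window of frames over a video stream and score
    each clip for behaviours, instead of classifying single images."""
    window = deque(maxlen=WINDOW_SIZE)
    for i, frame in enumerate(frames):
        window.append(frame)
        if len(window) == WINDOW_SIZE and i % STRIDE == 0:
            # classify_clip is a placeholder model returning, for example,
            # {"violence": 0.91, "crowd_forming": 0.12, ...}
            yield i, classify_clip(list(window))
```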

An additional reason for excitement is that machine learning, and artificial intelligence in general, has not yet reached a mature engineering phase; to succeed in this domain, there is an element of art. I enjoy this aspect, and the fact that we are doing something quite unique in the market.

Another factor is that we carry out machine-learning inference in real time.

With our focus on behavioural patterns, there is no formula or template for what we are doing because it is cutting-edge. It’s part of my excitement and challenge in life that our work requires continuous innovation and creativity.

How do you approach the development of viisights?

We are constantly looking at three dimensions for inspiration. One is, of course, what cities require. Sometimes they are limited by their own vision and by what they perceive to be the limits of the technology. I say to city executives: ‘Forget any constraints. Imagine you have unlimited money and resources: what would you like to have?’

They often come up with something that they didn’t ask for before, not because they don’t need it but because they are blocked by what traditional analytics companies are offering in the market.

The second dimension we look at is state-of-the-art research on machine learning. We are always looking to learn and potentially leverage the best in new algorithms and methodologies.

The third area is advances in hardware development. Whatever is happening on the server today will eventually be processed on the camera itself.

How are you using these sources of inspiration to enhance the viisights product?

There are two major types of improvement. One is functional improvements where we add new types of analytics. We are already experts in detecting violence, for example. Now, we are in the process of enhancing our software to detect people leaving behind an object, such as a bag or suitcase.
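
Left-behind-object detection is often described in the research literature as a combination of two cues: the object stays still, and its owner moves away. The Python sketch below illustrates that generic idea only; the thresholds and track fields are assumptions and do not reflect how viisights implements the feature.

```python
import math

STATIONARY_SECONDS = 30   # how long the object must be still (assumed)
OWNER_DISTANCE_M = 5.0    # how far the owner must have moved away (assumed)

def is_abandoned(obj_track, owner_track, now):
    """Toy rule for a left-behind object: it has been stationary for a while
    and the person who put it down has walked away.

    obj_track / owner_track are hypothetical dicts holding the object's last
    movement timestamp and the latest (x, y) positions in metres."""
    stationary_for = now - obj_track["last_moved_at"]
    ox, oy = obj_track["position"]
    px, py = owner_track["position"]
    separation = math.hypot(ox - px, oy - py)
    return stationary_for >= STATIONARY_SECONDS and separation >= OWNER_DISTANCE_M
```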

We are adding new features every few months.

Behind the scenes, we are also modifying the underlying technology and architecture and leveraging new types of algorithms or frameworks to improve the product. This enables new capabilities and increases accuracy, which of course is a major aspect of what our customers expect from us.

What are some examples of recent improvements?

Our system has been very adept at detecting short events for some time – events lasting between five and 30 seconds, such as violence breaking out.

Now we are adding capabilities for longer events that are measured in minutes, rather than seconds. Loitering near an ATM is an example.
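
A common way to frame this kind of long-duration event is as a dwell-time check on a tracked person inside a zone of interest. The sketch below is a generic illustration of that idea under assumed thresholds and data structures, not the viisights implementation.

```python
LOITER_SECONDS = 120   # dwell time that counts as loitering (assumed)

def check_loitering(dwell_start, track_id, in_zone, now):
    """Toy dwell-time check for long-running events such as loitering.

    dwell_start maps a tracked person's ID to the time they entered the zone
    of interest (e.g. the area around an ATM)."""
    if not in_zone:
        dwell_start.pop(track_id, None)   # person left the zone; reset timer
        return False
    entered = dwell_start.setdefault(track_id, now)
    return (now - entered) >= LOITER_SECONDS
```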

We are also investing a lot in enhancing the system’s capabilities even when there is a relatively small amount of data. One of our challenges is that machine learning is based on having a lot of data: you need a lot of examples.

For violence, we do have a lot of examples, but when it comes to people leaving behind bags or objects in train stations or airports, we don’t have many examples that show the characteristics of someone doing this in a suspicious way.

However, we can still train the machine learning models to leverage this small amount of data and provide accurate inference results.
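
As the next answer makes clear, the specifics of how viisights does this are proprietary. A standard technique in the wider field for learning from few examples is transfer learning: reuse a model pretrained on a large dataset and train only a small new head on the limited labelled data. The sketch below illustrates that generic approach and is not a description of viisights’ architecture; the backbone and dimensions are placeholders.

```python
import torch.nn as nn

def build_small_data_classifier(pretrained_backbone, feature_dim, num_classes):
    """Generic transfer-learning sketch: freeze a feature extractor trained on
    a large dataset and train only a small head on the few labelled examples
    available for the new behaviour."""
    for param in pretrained_backbone.parameters():
        param.requires_grad = False              # keep pretrained weights fixed
    head = nn.Linear(feature_dim, num_classes)   # only this layer is trained
    return nn.Sequential(pretrained_backbone, head)
```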

How do you do that?

What I can say is that we have developed a unique architecture that enables us to get good results with a smaller number of examples.

I can’t say more because that is our secret sauce.
