The holiday season brings lots of people to your front door. If you have a front door camera, you may be getting many alerts telling you there is motion at the door. It would be great if front doorbell cameras could take the next step, incorporate #AI facial/image recognition, and tell you through #iOS notifications WHO is at the front door and, in some cases, which “uniformed” person is at the door, e.g. a FedEx/UPS delivery person.
This facial recognition technology is already baked into Microsoft #OneDrive Photos and Apple #iCloud Photos. It wouldn’t be a huge leap to apply facial and object recognition to catalog the people who come to your front door as well as image recognition for uniforms that they are wearing, e.g., UPS delivery person.
iCloud and OneDrive Photos identify faces in your images and group them by likeness, so the owner of the photo gallery can label a group of faces as Grandma, for example. It may take one extra step for the camera owner to log in to the image/video storage service and classify a group of videos, converted to stills, containing the face of Grandma. Facebook (Meta) can also tag the faces within pictures you upload and share. The Facebook app can also “guess” faces based on previously uploaded images.
No need to launch the Ring app to see who’s at the front door. Facial recognition can remove the step required to find out what triggered the motion alert and simply post the iOS notification with the “who’s there”.
One less step: launching the Ring app to see who is at the front door.
The problem is more widespread than highlighted in the article. It’s not just these high-profile companies using “public domain” images to annotate with facial recognition notes and train machine learning (ML) models. Anyone can scan the Internet for images of people and build a vast library of faces. These faces can then be used to train ML models. In fact, using public domain images from “the Internet” cuts across multiple data sources, not just Flickr, which increases the sample size and may improve the model.
The rules around “Public Domain” image licensing may need to be updated. One possibly simple solution: add a watermark to any image that does not have permission to be used for facial recognition model training. All image processors could then be required to include a preprocessor that detects the watermark and, if it is found, excludes the image from model training.
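As a rough illustration of that preprocessor, here is a minimal Python sketch. It assumes some watermark detector already exists and has listed the marks it found on each image; the marker name NO_FR_TRAINING_MARK, the Image class, and the example URLs are all hypothetical and not part of any real standard.

```python
from dataclasses import dataclass

# Hypothetical marker name; no watermark format is defined in this post.
NO_FR_TRAINING_MARK = "no-facial-recognition-training"


@dataclass(frozen=True)
class Image:
    url: str
    marks: frozenset  # watermarks an assumed detector found in the image


def has_opt_out_mark(image: Image) -> bool:
    """Stand-in for a real watermark detector, which would inspect pixels."""
    return NO_FR_TRAINING_MARK in image.marks


def filter_training_set(images):
    """Keep only images that do not carry the opt-out watermark."""
    return [img for img in images if not has_opt_out_mark(img)]


candidates = [
    Image("https://example.com/a.jpg", frozenset()),
    Image("https://example.com/b.jpg", frozenset({NO_FR_TRAINING_MARK})),
]
allowed = filter_training_set(candidates)
print([img.url for img in allowed])
```

The filter runs before any training step, so an image with the mark never reaches the model.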
about deconstructing existing functionality of entire Photo Archive and Sharing platforms.
to bring an awareness to the masses about corporate decisions to omit the advanced capabilities of cataloguing photos, object recognition, and advanced metadata tagging.
Backstory: The Asks / Needs
Every day my family takes tons of pictures, and the pictures are bulk loaded up to The Cloud using Cloud Storage Services, such as Dropbox, OneDrive, Google Photos, or iCloud. A selected set of photos is uploaded to our favourite Social Networking platform (e.g. Facebook, Instagram, Snapchat, and/or Twitter).
Every so often, I pause and either create a Photobook or print out pictures from the last several months. The kids may have a school project that calls for a printout, e.g. a Family Portrait or just a picture of Mom and the kids. In order to find these photos, I have to manually go through our collection of photographs from our Cloud Storage Services, or identify the photos from our Social Network libraries.
Social Networking Platform Facebook
For as long as I can remember, the Social Networking platform Facebook has had the ability to tag faces in photos uploaded to the platform. There are restrictions, such as whom you can tag from the privacy side, but the capability still exists. The Facebook platform also automatically identifies faces within photos, i.e. places a box around faces in a photo to make tagging people easier. So, in essence, there is an “intelligent capability” to identify faces in a photo. It seems like the Facebook platform allows you to see “Photos of You”, but what seems to be missing is the ability to search for all photos of Fred Smith, a friend of yours, even if all his photos are public. By design, that seems fit for the purpose of the networking platform.
Automatically upload new images in bulk or one at a time to a Cloud Storage Service (with or without Online Printing Capabilities, e.g. Photobooks) and an automated curation process begins.
The Auto Curation process scans photos for:
“Commonly Identifiable Objects”, such as #Car, #Clock, #Fireworks, and #People
During Auto Curation, previously tagged objects and faces that appear in newly uploaded photos are tagged automatically.
Once auto curation runs several times, and people are manually #tagged, the auto curation process will “Learn” faces. Any new auto curation process executed should be able to recognize tagged people in new pictures.
The Auto Curation process emails / notifies the library owners of the ingestion process results, e.g. Jane Doe and John Smith photographed at Disney World on Date / Time stamp, i.e. a report of the executed ingestion and auto curation process.
After the upload and auto curation process, optionally, it’s time to manually tag people’s faces and any ‘objects’ which you would like to track, e.g. a car aficionado might #tag vehicle make/model with additional descriptive tags. Using the photo curator function on the Cloud Storage Service, you can tag any “objects” in the photo using Rectangle or Lasso Select.
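The curation steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the object labels and face-cluster ids are presumed to come from detectors that are not shown, and the Photo class, auto_curate function, and known_faces mapping are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    filename: str
    objects: list   # labels from an assumed object detector, e.g. "Car"
    face_ids: list  # face-cluster ids from an assumed face detector
    tags: set = field(default_factory=set)


def auto_curate(photos, known_faces):
    """Tag each photo with detected objects and any faces already named.

    known_faces maps a face-cluster id to the name assigned by manual tagging.
    Returns a short ingestion report, one line per photo.
    """
    report = []
    for photo in photos:
        # Commonly identifiable objects become hashtags.
        photo.tags.update(f"#{label}" for label in photo.objects)
        # Faces that were manually tagged before are recognized by name.
        names = [known_faces[f] for f in photo.face_ids if f in known_faces]
        photo.tags.update(names)
        unknown = sum(1 for f in photo.face_ids if f not in known_faces)
        people = ", ".join(sorted(names)) or "no known people"
        report.append(f"{photo.filename}: {people}; {unknown} untagged face(s)")
    return report


known_faces = {"face-1": "Jane Doe"}
photos = [Photo("disney.jpg", ["Fireworks"], ["face-1", "face-2"])]
report = auto_curate(photos, known_faces)
print(report)
```

Each manual tagging session adds entries to known_faces, so later curation runs recognize more people without extra work.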
Curation to Take Action
Once photo libraries are curated, the library owner(s) can:
Automatically build albums based on one or more #tags
Smart Albums automatically update, e.g. after ingestion and Auto Curation. Albums are tag sensitive and update with new pics that contain certain people or objects. The user/librarian may dictate logic for tags.
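One way the tag logic for a Smart Album might look, as a minimal Python sketch. The rule shape (tags that must all be present plus tags of which at least one must be present) is an assumption made for illustration, as are the function names and the sample library.

```python
def matches(tags, all_of=(), any_of=()):
    """True if tags contain every all_of tag and, when given, one any_of tag."""
    tags = set(tags)
    return set(all_of) <= tags and (not any_of or bool(tags & set(any_of)))


def build_album(library, all_of=(), any_of=()):
    """Return the sorted filenames whose tags satisfy the album rule."""
    return sorted(
        name for name, tags in library.items() if matches(tags, all_of, any_of)
    )


# Hypothetical curated library: filename -> set of tags.
library = {
    "beach.jpg": {"Grandma", "#People"},
    "garage.jpg": {"#Car"},
    "picnic.jpg": {"Grandma", "Mom", "#People"},
}
album = build_album(library, all_of=["Grandma"], any_of=["Mom", "#Car"])
print(album)
```

Re-running build_album after each ingestion is what would keep the album current as new photos are tagged.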
Where is this Functionality??
Why are many major companies not implementing facial (and object) recognition? Google and Microsoft seem to have the capability and size to be able to produce the technology.
Is it possible Google and Microsoft are subject to more scrutiny than a Shutterfly? Do privacy concerns, at the moment, leave others to become trailblazers in this area?
As a step to pacify all of the mocking around Google Glass, the current Governor of California, Jerry Brown, in conjunction with Arnold Schwarzenegger, as a gag to allude to the Terminator movies, will announce later in the year that a motor unit, or police motorcycle, will use Google Glass with plate and face recognition systems to help them identify and, if necessary, detain suspects for outstanding warrants. The specific city identified for this program has yet to be specified.
Three paragraphs are extremely interesting, and imply military applications as well as policing their own people.
Kuo said the device will be mounted on a headset with a small LCD screen and will allow users to make image and voice searches as well as conduct facial recognition matches.
“What you are doing with your camera, for example, taking a picture of a celebrity and then checking on our database to see if we have a facial image match, you could do the same thing with a wearable visual device,” Kuo said.
“We haven’t decided whether it is going to be released in any commercial form right now, but we experiment with every kind of technology that is related to search,” Kuo said. Kuo declined to comment on the other functions of the Baidu Eye or whether Baidu is working on other forms of wearable technology.
It implies that people who are targeted for ‘crimes’ such as civil disobedience may be tracked in a database. The last paragraph implies that the technology may be targeted for ‘public’ / government sector use. In addition, all governments may use this technology at their borders for easier recognition of targeted individuals. I could also visualize other highly policed states, where terrorism is very active, providing these glasses to transportation gatekeepers, such as bus drivers or train conductors. At the point of collecting tickets, they may be able to perform retinal recognition and, depending on the accuracy of the technology, collect fees as well as identify passengers with outstanding warrants for arrest. A person may board a bus and be identified through facial, retinal, and/or voice recognition; if the person clears a security check, the bus driver may automatically ask whether they would like the fare deducted from their linked checking account, or from which credit card, identified by its last four digits.
This technology might eventually be mandated by the states within the EU. That’s a thought, as well as the requirement to connect each border check to cross-reference with Interpol, the World Health Organization (WHO) for the spread of possible infectious disease control, as well as local government warrants.
As described in a previous post (see Streaming Video Freelance: Video Affiliate Network channels like Google Plus / YouTube, Facebook, Twitter and Viveo (everything from bar impromptu jams to concert events)), I did not mention all the little details needed to avoid certain issues. For example, either the faces of everyone in the audience must be obscured by real-time face-blurring technology, similar to facial recognition, or the people at the event must sign a waiver, which might not be likely. The solution will therefore need delayed streaming to give the face-blurring software time to work and, for some televised events, to allow compliance with FCC regulations by introducing artificial intelligence (AI) word-bleep insertion and object recognition to blur exposed body parts. 🙂
The ridiculously amazing advertising introduction: if a person signs up to allow themselves to be seen, they could be picked out if they are wearing a hat, sneakers, or a certain brand of shirt. If the AI object recognition picks up the object, a hue can accentuate the item; the viewer can pause the stream, click on the advertiser’s object on a video streaming tablet touch screen, and get a list of local and web distributors, prioritized by advertising, popularity, and rating. The same goes for red carpet events: although dresses, suits, and accessories might not be advertised because of cost, the event could be enhanced in the same way, with the viewer pausing to get a blurb about the designer and the object, plus a link to the catalog or portfolio of the designer’s work. This introduces advertising revenue for on-sight products.
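The delayed-streaming idea can be pictured as a small buffer that holds processed frames back for a fixed number of frames before releasing them. The Python sketch below is only an illustration: the DelayedStream class is hypothetical, and the blur step is a placeholder string operation rather than real face-blurring software.

```python
from collections import deque


class DelayedStream:
    """Release each processed frame only after delay_frames newer frames arrive."""

    def __init__(self, delay_frames, process):
        self.delay_frames = delay_frames
        self.process = process  # e.g. a face-blurring step (assumed, not shown)
        self.buffer = deque()

    def push(self, frame):
        """Process and queue a frame; return the oldest frame once delayed enough."""
        self.buffer.append(self.process(frame))
        if len(self.buffer) > self.delay_frames:
            return self.buffer.popleft()
        return None  # still filling the delay window


# Placeholder "blur": a real system would run detection and blurring here.
stream = DelayedStream(delay_frames=2, process=lambda f: f"blurred({f})")
outputs = [stream.push(f) for f in ["f1", "f2", "f3", "f4"]]
print(outputs)
```

A longer delay gives slower processing steps, such as word-bleep insertion, more time before the frame goes out to viewers.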
Even the videographer’s smartphone can have an addressable link to the product used to stream, as well as a small logo overlay which, if clicked, provides a profile of the videographer and a brief portfolio of their work. All of this runs from a tablet application that gives the licensed distributor(s) a main channel in the center of the screen, with small square boxes on the outer edge of the main window, like picture-in-picture (PiP). A person just taps one of the border-boxed streams, and that box becomes the stream that takes up the majority of the screen. The main window can have the absolute maximum frames per second (fps) for the device, and the border windows may have half or a quarter of that, so you still get to preview the alternate-perspective streams while watching the main stream. You can also auto-hide the border preview streams to focus the user on the main stream while still giving them the flexibility of alternate vantage points. The system may allow videographers to bid for border placement of their streams in a particular corner, and for how frequently they appear. The user can also have a ‘favorite’ freelance videographer stream, and the system would allow for quick links to the streams of videographers the user knows will be at the event.
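The frame-rate split between the main window and the border previews might be sketched like this in Python. The function name allocate_fps, the stream ids, and the 60 fps device maximum are hypothetical values chosen for the example.

```python
def allocate_fps(stream_ids, main_id, max_fps, preview_divisor=2):
    """Give the main stream max_fps and each border preview a fraction of it.

    preview_divisor=2 means half the frame rate, 4 means a quarter.
    """
    if main_id not in stream_ids:
        raise ValueError(f"unknown main stream: {main_id}")
    return {
        sid: max_fps if sid == main_id else max_fps // preview_divisor
        for sid in stream_ids
    }


streams = ["stage", "crowd", "backstage"]
layout = allocate_fps(streams, main_id="stage", max_fps=60)
# Tapping the "crowd" preview makes it the main stream; previews drop to a quarter.
swapped = allocate_fps(streams, main_id="crowd", max_fps=60, preview_divisor=4)
print(layout, swapped)
```

Tapping a border box then amounts to calling the function again with a different main_id.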