Questions tagged [apple-vision]

Apple Vision is a high-level computer vision framework used to identify faces, detect and track features, and classify images, video, tabular data, audio, and motion sensor data.

The Apple Vision framework performs face and face-landmark detection on input images and video, barcode recognition, image registration, text detection, and feature tracking. The Vision API also allows the use of custom Core ML models for tasks like classification or object detection.
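As a rough illustration of the API shape described above, here is a minimal face-detection sketch. It assumes a `CGImage` obtained elsewhere (e.g. from `UIImage.cgImage`) and requires an Apple platform to run:

```swift
import Vision

// Minimal sketch: run face-rectangle detection on a CGImage and
// return the resulting observations (normalized bounding boxes).
func detectFaces(in cgImage: CGImage) throws -> [VNFaceObservation] {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
    return request.results ?? []
}
```

The same handler-plus-request pattern applies to the other request types (text, barcode, rectangle detection); only the `VNRequest` subclass changes.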

145 questions
87
votes
2 answers

iOS revert camera projection

I'm trying to estimate my device's position relative to a QR code in space. I'm using ARKit and the Vision framework, both introduced in iOS 11, but the answer to this question probably doesn't depend on them. With the Vision framework, I'm able to get…
Guig
  • 8,612
  • 5
  • 47
  • 99
57
votes
8 answers

Converting a Vision VNTextObservation to a String

I'm looking through Apple's Vision API documentation and I see a couple of classes that relate to text detection in UIImages: 1) class VNDetectTextRectanglesRequest 2) class VNTextObservation. It looks like they can detect characters, but I don't…
Adrian
  • 14,925
  • 16
  • 92
  • 163
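The question above predates Vision's built-in text recognition; on iOS 13 and later, the usual way to get actual strings (rather than just `VNTextObservation` rectangles) is `VNRecognizeTextRequest`. A minimal sketch, assuming a `CGImage` input and an Apple platform:

```swift
import Vision

// Minimal sketch (iOS 13+ / macOS 10.15+): run text recognition and
// collect the top candidate string from each recognized observation.
func recognizeText(in cgImage: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

`topCandidates(_:)` returns recognition candidates ordered by confidence, so taking the first gives the most likely transcription per observation.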
23
votes
3 answers

Apple Vision framework – Text extraction from image

I am using the Vision framework in iOS 11 to detect text in an image. The text is detected successfully, but how can we get the detected text?
Abhishek
  • 4,857
  • 5
  • 20
  • 25
11
votes
3 answers

Classify faces from VNFaceObservation

I'm working with the Vision framework to detect faces and objects in multiple images, and it works fantastically. But I have a question that I can't find in the documentation. The Photos app on iOS classifies faces, and you can tap on a face to show all the images…
mhergon
  • 1,488
  • 1
  • 16
  • 35
10
votes
3 answers

Apple Vision image recognition

As many other developers, I have plunged myself into Apple's new ARKit technology. It's great. For a specific project however, I would like to be able to recognise (real-life) images in the scene, to either project something on it (just like…
9
votes
3 answers

ARKit and Vision frameworks for Object Recognition

I would really like some guidance on combining Apple's new Vision API with ARKit in a way that enables object recognition. This would not need to track the moving object, just recognize it at a stable position in 3D space for the AR experience to react…
cnzac
  • 415
  • 2
  • 12
8
votes
3 answers

Convert VNRectangleObservation points to other coordinate system

I need to convert the CGPoints received from a VNRectangleObservation (bottomLeft, bottomRight, topLeft, topRight) to another coordinate system (e.g. a view's coordinates on screen). I define a request: // Rectangle Request let…
mihaicris
  • 151
  • 2
  • 8
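The core difficulty in the question above is that Vision reports normalized coordinates with the origin at the bottom-left, while UIKit views put the origin at the top-left. A minimal sketch of the conversion, assuming the image fills the view exactly (no aspect-fit letterboxing):

```swift
import Foundation

// Convert a Vision normalized rect (origin bottom-left, values 0...1)
// into a view's coordinate space (origin top-left).
func convertToViewRect(_ normalized: CGRect, viewSize: CGSize) -> CGRect {
    let x = normalized.origin.x * viewSize.width
    let w = normalized.size.width * viewSize.width
    let h = normalized.size.height * viewSize.height
    // Flip the y-axis: Vision's origin is bottom-left, UIKit's is top-left.
    let y = (1 - normalized.origin.y - normalized.size.height) * viewSize.height
    return CGRect(x: x, y: y, width: w, height: h)
}
```

On Apple platforms, `VNImageRectForNormalizedRect` performs the scaling part of this for image coordinates, but the y-flip into a top-left view space still has to be done by hand.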
8
votes
2 answers

VNTrackRectangleRequest internal error

I'm trying to get a simple rectangle-tracking controller going, and I can get rectangle detection working just fine, but the tracking request always ends up failing for a reason I can't quite find. Sometimes the tracking request will fire its…
Andy Heard
  • 1,462
  • 14
  • 24
8
votes
3 answers

Vision Framework Barcode detection for iOS 11

I've been implementing a test of the new Vision framework which Apple introduced at WWDC 2017. I am specifically looking at barcode detection: after scanning an image from the camera or gallery, I've been able to determine whether or not it's a barcode image.…
Hitesh Arora
  • 81
  • 1
  • 4
7
votes
2 answers

How can I tell which languages are available for text recognition in Apple's Vision framework?

I'm trying to add an option to my app to allow for different languages when using Apple's Vision framework for recognising text. There seems to be a function for programmatically returning the supported languages, but I'm not sure if I'm calling it…
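For reference, on iOS 15 / macOS 12 and later, `VNRecognizeTextRequest` exposes an instance method for this query (earlier releases used the class method `supportedRecognitionLanguages(for:revision:)`). A minimal sketch, runnable only on an Apple platform:

```swift
import Vision

// Minimal sketch: ask a configured VNRecognizeTextRequest which
// language codes it supports at its current recognition level.
func availableRecognitionLanguages() -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    return (try? request.supportedRecognitionLanguages()) ?? []
}
```

The returned list depends on the recognition level and request revision, so querying the same configured request you intend to perform gives the most accurate answer.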
7
votes
1 answer

Apple Vision – Can't recognize a single number as region

I want to use VNDetectTextRectanglesRequest from the Vision framework to detect regions in an image containing only one character, the number '9', on a white background. I'm using the following code to do this: private func performTextDetection() { …
AndrzejZ
  • 215
  • 2
  • 8
6
votes
0 answers

How to convert BoundingBox from VNRequest to CVPixelBuffer Coordinate

I'm trying to crop a CVImageBuffer (from AVCaptureOutput) using the boundingBox of a face detected by Vision (VNRequest). When I draw over the AVCaptureVideoPreviewLayer using: let origin = previewLayer.layerPointConverted(fromCaptureDevicePoint:…
Alak
  • 1,187
  • 2
  • 10
  • 17
6
votes
1 answer

ARKit – sceneView renders its content at 120 fps (but I need 30 fps)

I'm developing an ARKit app along with the Vision/AVKit frameworks. My app recognizes hand gestures ("Victory", "Okey", "Fist") for controlling a video, so I'm using an MLModel for classification of my hand gestures. The app works fine, but the view's content…
Andy Fedoroff
  • 26,838
  • 8
  • 85
  • 144
6
votes
1 answer

ARKit & Vision frameworks – Detecting wall edges

I wonder whether it's theoretically possible to detect wall edges/lines (like in the picture). All I could achieve is detecting the vertices of rectangles that are visible in the camera preview. But we can't consider real walls as rectangles. So, is there…
arturdev
  • 10,228
  • 2
  • 34
  • 62
6
votes
1 answer

How can I take a photo of a detected rectangle in Apple Vision framework

How can I take a photo (get a CIImage) from a successful VNRectangleObservation object? I have a video capture session running, and in func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection:…
denis631
  • 1,596
  • 1
  • 14
  • 35