UIImagePickerController provides a straightforward way to take a picture. It supports all the basic features, such as choosing the camera source (front or back), tapping an area to lock focus and exposure, and very simple editing of the result.
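For example, a minimal sketch of presenting the system camera UI with UIImagePickerController could look like this (the class name and delegate handling here are only illustrative):

import UIKit

class PickerViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {

    // Present the system camera UI (requires NSCameraUsageDescription in Info.plist)
    func showCamera() {
        guard UIImagePickerController.isSourceTypeAvailable(.camera) else { return }
        let picker = UIImagePickerController()
        picker.sourceType = .camera
        picker.cameraDevice = .rear      // or .front
        picker.allowsEditing = true      // enables the simple built-in editing
        picker.delegate = self
        present(picker, animated: true)
    }

    // Delegate callback with the captured (or edited) image
    func imagePickerController(_ picker: UIImagePickerController,
                               didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey: Any]) {
        let image = (info[.editedImage] ?? info[.originalImage]) as? UIImage
        picker.dismiss(animated: true)
        _ = image // use the image here
    }
}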
However, when direct access to the camera is necessary, the AVFoundation framework gives full control, for example for changing hardware parameters programmatically or manipulating the live preview.
So AVFoundation, along with a couple of other frameworks, is the gateway to the camera, the microphone, and multimedia support in iOS. AVFoundation allows us to combine inputs (such as the camera or microphone) and outputs (such as a UIImage or a video file), which we can then use for any purpose.
Here are the main classes of the AVFoundation framework:

AVCaptureSession. An object that manages capture activity and coordinates the flow of data from input devices to capture outputs. To perform real-time capture, you instantiate an AVCaptureSession object and add appropriate inputs and outputs. When the session is ready, we need to provide it with a source of data; for this purpose we create an instance of AVCaptureDevice.

AVCaptureDevice. The interface to the hardware camera, used to control hardware features such as the position of the lens, the exposure, and the flash.

AVCaptureDeviceInput. Provides the data coming from the device.

AVCaptureOutput. An abstract class describing the result of a capture session. There are concrete subclasses to capture a still image or to capture the raw frames for a live preview.

AVCaptureVideoPreviewLayer. The preview layer is just a CALayer that you can create from a capture session and add as a sublayer into your view. All it does is present the video that is running through the capture session.

Let's look at the AVFoundation capture process in general. You can think of it as a pipeline from hardware to software. You have a central AVCaptureSession that has inputs and outputs and mediates the data between the two. Your inputs come from AVCaptureDevice instances, which are software representations of the different audio/visual hardware components of an iOS device. The AVCaptureOutput extracts data from whatever is feeding into the capture session.
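As a rough sketch of that pipeline (error handling reduced to early returns; the helper name makeSession is just for illustration):

import AVFoundation

func makeSession() -> AVCaptureSession? {
    let session = AVCaptureSession()

    // Input: a software representation of the camera hardware
    guard let camera = AVCaptureDevice.default(for: .video),
          let input = try? AVCaptureDeviceInput(device: camera),
          session.canAddInput(input) else { return nil }
    session.addInput(input)

    // Output: a concrete AVCaptureOutput subclass for still photos
    let photoOutput = AVCapturePhotoOutput()
    guard session.canAddOutput(photoOutput) else { return nil }
    session.addOutput(photoOutput)

    return session
}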
Now, let's write some code to capture a photo using AVCaptureSession.
Before setting up the capture session, add the camera permission key Privacy - Camera Usage Description to the Info.plist file; it is required to access the camera.
<key>NSCameraUsageDescription</key>
<string>Accessing your camera to take photo.</string>
<key>NSPhotoLibraryUsageDescription</key>
<string>We are accessing your photos.</string>
Here is an example of manual photo capture that saves the result to the Gallery or recognizes text in the photo (just uncomment self?.recognizeImage(image)).
import UIKit
import AVFoundation
import Vision

class AVTestViewController: UIViewController {

    let session = AVCaptureSession()
    let output = AVCapturePhotoOutput()
    var previewLayer = AVCaptureVideoPreviewLayer()
    let avQueue = DispatchQueue(label: "AVQueue", qos: .userInitiated)

    let shutterButton: UIButton = {
        let v = UIButton(frame: CGRect(x: 0, y: 0, width: 100, height: 100))
        v.layer.cornerRadius = 50
        v.layer.borderWidth = 10
        v.layer.borderColor = UIColor.white.cgColor
        return v
    }()

    override open var preferredInterfaceOrientationForPresentation: UIInterfaceOrientation {
        return .portrait
    }

    override var shouldAutorotate: Bool {
        return true
    }

    override var prefersStatusBarHidden: Bool {
        return true
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        view.layer.addSublayer(previewLayer)
        view.addSubview(shutterButton)
        shutterButton.addTarget(self, action: #selector(handleTap), for: .touchUpInside)
        checkPermission()
    }

    override func viewDidLayoutSubviews() {
        super.viewDidLayoutSubviews()
        shutterButton.center = CGPoint(x: view.frame.size.width / 2, y: view.frame.size.height - 100)
        updatePreviewLayerFrame()
    }

    func setupCamera() {
        if let device = getDevice() {
            do {
                session.sessionPreset = .photo
                let input = try AVCaptureDeviceInput(device: device)
                if session.canAddInput(input) {
                    session.addInput(input)
                }
                if session.canAddOutput(output) {
                    session.addOutput(output)
                }
                previewLayer.videoGravity = .resizeAspectFill
                previewLayer.session = session
                updatePreviewLayerFrame()
                // startRunning() blocks, so call it off the main thread
                avQueue.async { [weak self] in
                    self?.session.startRunning()
                }
            } catch {
                print(error.localizedDescription)
            }
        }
    }

    func getDevice() -> AVCaptureDevice? {
        let anyDevice = AVCaptureDevice.default(for: .video)

        // To pick a specific camera, change `let` above to `var`
        // and uncomment one of the blocks below.
        //
        // // Back camera:
        // if let device = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back) {
        //     anyDevice = device
        // } else {
        //     fatalError("no back camera")
        // }
        //
        // // Front camera:
        // if let device = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .front) {
        //     anyDevice = device
        // } else {
        //     fatalError("no front camera")
        // }

        return anyDevice
    }

    func checkPermission() {
        switch AVCaptureDevice.authorizationStatus(for: .video) {
        case .notDetermined:
            AVCaptureDevice.requestAccess(for: .video) { granted in
                guard granted else { return }
                DispatchQueue.main.async { [weak self] in
                    self?.setupCamera()
                }
            }
        case .restricted:
            break
        case .denied:
            break
        case .authorized:
            setupCamera()
        @unknown default:
            break
        }
    }

    @objc func handleTap() {
        // Match the capture orientation to the current device orientation
        let myShotOrientation = UIDevice.current.orientation.asCaptureVideoOrientation
        if let photoOutputConnection = output.connection(with: .video) {
            photoOutputConnection.videoOrientation = myShotOrientation
        }

        let photoSettings = AVCapturePhotoSettings()
        //photoSettings.isHighResolutionPhotoEnabled = true
        photoSettings.flashMode = .auto
        output.capturePhoto(with: photoSettings, delegate: self)

        // Small haptic feedback for the shutter tap
        let generator = UIImpactFeedbackGenerator(style: .light)
        generator.impactOccurred()
    }

    private func updatePreviewLayerFrame() {
        let orientation = UIDevice.current.orientation
        if let connection = previewLayer.connection,
           let videoOrientation = AVCaptureVideoOrientation(rawValue: orientation.rawValue) {
            previewLayer.frame = view.bounds
            connection.videoOrientation = videoOrientation
            previewLayer.removeAllAnimations()
        }
    }
}

extension AVTestViewController: AVCapturePhotoCaptureDelegate {

    func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?) {
        session.stopRunning()

        guard let data = photo.fileDataRepresentation(),
              let image = UIImage(data: data) else { return }

        let imageView = UIImageView(image: image)
        imageView.contentMode = .scaleAspectFill
        imageView.frame = view.bounds

        DispatchQueue.main.async { [weak self] in
            self?.view.addSubview(imageView)

            // Save to the Gallery
            UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)

            // Recognize text
            //self?.recognizeImage(image)
        }
    }

    func recognizeImage(_ image: UIImage) {
        let request = VNRecognizeTextRequest { (request, error) in
            guard let observations = request.results as? [VNRecognizedTextObservation] else {
                print("Error recognizing text: \(String(describing: error))")
                return
            }
            for observation in observations {
                guard let topCandidate = observation.topCandidates(1).first else { continue }
                print("Recognized text: \(topCandidate.string)")
            }
        }

        guard let cgImage = image.cgImage else { return }
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print("Error performing Vision request: \(error)")
        }
    }
}

extension UIDeviceOrientation {
    var asCaptureVideoOrientation: AVCaptureVideoOrientation {
        switch self {
        case .landscapeLeft: return .landscapeRight
        case .landscapeRight: return .landscapeLeft
        case .portraitUpsideDown: return .portraitUpsideDown
        default: return .portrait
        }
    }
}
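If the recognition results need tuning, VNRecognizeTextRequest exposes a few options. Here is a minimal sketch (the helper name and the language value are only illustrative):

import Vision

func makeTextRequest() -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest { request, error in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in observations {
            // Each candidate carries the recognized string and a confidence score
            if let candidate = observation.topCandidates(1).first {
                print("\(candidate.string) (confidence: \(candidate.confidence))")
            }
        }
    }
    request.recognitionLevel = .accurate       // .fast trades accuracy for speed
    request.usesLanguageCorrection = true
    request.recognitionLanguages = ["en-US"]   // example value; adjust for your content
    return request
}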