
Part 2 | Practices on Android Real-Time Screen Sharing and MediaCodec Hardware Codec Optimization

Android's hardware codec API, MediaCodec, has a long-standing reputation for device fragmentation and integration complexity. In a remote support session, an Android device must be able to act both as the controlled client (sender), compressing high-resolution frames, and as the helper client (receiver), decoding H.264 video streams in real time.

Achieving low latency on both ends requires resolving system-wide frame capture permissions, handling layout resizing without causing tearing, and building a resilient recovery mechanism for hardware codec failures.

In this deep dive, we explore how the Easy Connect Suite integrates MediaProjection, OpenGL ES, and MediaCodec to create a stable, low-latency video pipeline.


1. Capturing the Screen: MediaProjection and Foreground Services

Starting with Android 10 (API 29), apps that target that level or higher may capture display contents only while a foreground service declared with type mediaProjection is running. In practice, the app obtains and uses its MediaProjection from within that service.

Setup Flow:

  1. Permission Request: Call MediaProjectionManager.createScreenCaptureIntent() to trigger the OS projection consent prompt. The user's approval returns a token Intent.
  2. Service Initialization: Once consent is granted, call startForegroundService() to launch the capture service. The service must then promote itself with startForeground() within a few seconds.
  3. Manifest Declaration: Register the service in AndroidManifest.xml with the appropriate type:
    xml
    <service
        android:name=".AssistScreenCaptureService"
        android:foregroundServiceType="mediaProjection" />
  4. Projection Acquisition: Retrieve the MediaProjection instance using the approved token intent, which acts as our system-wide display capture source.

2. Zero CPU Copying: MediaCodec Surface Mode

Traditional Android capture pipelines fetch screen buffers using an ImageReader. This pulls the raw frame from the GPU to CPU memory, converts it to YUV bytes, and pushes it back to the encoder. For 1080p resolutions, this memory copy loop causes heavy CPU loads, high heat generation, and dropped frames.
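
To put rough numbers on that copy loop, the arithmetic below estimates how many bytes per second the CPU touches in a readback path. It assumes a 1920x1080 RGBA_8888 capture (4 bytes per pixel), a conversion to YUV 4:2:0 (1.5 bytes per pixel), and 30 frames per second. The frame rate and the class name ReadbackCost are illustrative assumptions, not figures from Easy Connect Suite.

```java
/** Back-of-the-envelope cost of a CPU readback capture path (illustrative only). */
final class ReadbackCost {

    /** Bytes in one RGBA_8888 frame: 4 bytes per pixel. */
    static long rgbaFrameBytes(int width, int height) {
        return (long) width * height * 4;
    }

    /** Bytes in one YUV 4:2:0 frame: full-size luma plus two quarter-size chroma planes. */
    static long yuv420FrameBytes(int width, int height) {
        return (long) width * height * 3 / 2;
    }

    /** Bytes per second the CPU handles: one RGBA read plus one YUV write per frame. */
    static long bytesPerSecond(int width, int height, int fps) {
        return (rgbaFrameBytes(width, height) + yuv420FrameBytes(width, height)) * fps;
    }

    public static void main(String[] args) {
        int width = 1920, height = 1080;
        int fps = 30; // assumed frame rate
        System.out.println("RGBA frame bytes:   " + rgbaFrameBytes(width, height));
        System.out.println("YUV420 frame bytes: " + yuv420FrameBytes(width, height));
        System.out.println("Bytes per second:   " + bytesPerSecond(width, height, fps));
    }
}
```

Under these assumptions, the CPU moves roughly 342 MB of pixel data every second before the encoder does any work, not counting the color conversion itself.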

Easy Connect Suite implements a Surface Input Mode to keep data within the GPU:

[ Screen Capture (MediaProjection) ]
                │
                ├─ (Renders directly to Surface)
                ▼
   [ Encoder Input Surface ] (Allocated via createInputSurface())
                │
                ▼ (Hardware encoder processes buffer in GPU RAM)
   [ MediaCodec AVC Encoder ] (COLOR_FormatSurface Mode)

  1. Encoder Configuration: Initialize the video encoder (video/avc) and configure its input color format to MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface.
  2. Input Surface Creation: Call codec.createInputSurface() to get a GPU-backed drawing surface managed directly by the hardware codec.
  3. Virtual Display Setup: Create a VirtualDisplay using mediaProjection.createVirtualDisplay(...), binding its output directly to the encoder's input surface.

With this setup, the OS compositor (SurfaceFlinger) writes each screen frame directly into the graphics buffers behind the encoder's input surface. The CPU only issues control calls and never touches pixel data, which gives us zero-copy hardware encoding.


3. Smooth Transitions: OpenGL ES Orientation and Scaling Rotator

When a user rotates the phone between portrait and landscape during a support session, the captured display dimensions swap (e.g., from 1080x2400 to 2400x1080). A running MediaCodec encoder session cannot change the dimensions of its input surface, so feeding it the rotated frames directly produces stretched video or can crash the receiver's decoder.

To handle this, we route frames through our AssistCapturePortraitLandscapeRotator (an OpenGL ES frame converter):

[ MediaProjection Source ] ──► [ OES External Texture (GPU) ]
                                        │
                                        ▼ (OpenGL Render Loop)
                             [ Matrix Transform: Scale & Rotate ]
                                        │
                                        ▼ (EGLSurface Swap)
                             [ Encoder Input Surface (Fixed size) ]

  1. Intermediate Texture: Instead of sending display buffers directly to the encoder surface, we render them to an OpenGL external OES texture (target GL_TEXTURE_EXTERNAL_OES, sampled in the fragment shader as samplerExternalOES).
  2. EGL Context Setup: Run a background thread to manage a dedicated EGL Context, mapping the encoder's input surface as an EGLSurface.
  3. Projection Matrix Transform: When device rotation is detected, the thread updates the orthographic projection matrix:
    • Proportional Scaling: Scales the frame to match the target encoder resolution, centering the image and adding black bars to prevent stretching.
    • GPU Rotation: Applies a rotation matrix in the vertex shader to rotate the frame on the GPU.
  4. Drawing: Call eglSwapBuffers to submit the processed frame to the encoder's input surface, so the output stream keeps a fixed resolution and an undistorted aspect ratio.
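
The scale-and-rotate step can be sketched in plain Java as the matrix a vertex shader would multiply with a full-screen quad. This is a minimal sketch, not the AssistCapturePortraitLandscapeRotator implementation: the class and method names are hypothetical, rotation is limited to multiples of 90 degrees, and the rotation direction is an assumed convention that may need flipping on a real device. The matrix is column-major, the layout glUniformMatrix4fv expects when transpose is false.

```java
/** Builds a column-major 4x4 matrix that rotates a full-screen quad and letterboxes it. */
final class LetterboxTransform {

    /** Scale factors {sx, sy} that fit content into the target without stretching. */
    static float[] fitScale(int contentW, int contentH, int targetW, int targetH) {
        float contentAspect = (float) contentW / contentH;
        float targetAspect = (float) targetW / targetH;
        if (contentAspect > targetAspect) {
            // Content is relatively wider: fill the width, bars go above and below.
            return new float[] {1f, targetAspect / contentAspect};
        }
        // Content is relatively taller (or equal): fill the height, bars go left and right.
        return new float[] {contentAspect / targetAspect, 1f};
    }

    /** Returns scale * rotation for a quad spanning [-1, 1] on both axes. */
    static float[] matrix(int srcW, int srcH, int rotationDegrees, int targetW, int targetH) {
        int r = ((rotationDegrees % 360) + 360) % 360;
        float cos, sin;
        switch (r) {
            case 0:   cos = 1f;  sin = 0f;  break;
            case 90:  cos = 0f;  sin = 1f;  break;
            case 180: cos = -1f; sin = 0f;  break;
            case 270: cos = 0f;  sin = -1f; break;
            default: throw new IllegalArgumentException("rotation must be a multiple of 90");
        }
        // A quarter turn swaps the width and height the content occupies on the target.
        boolean swap = (r == 90 || r == 270);
        float[] s = fitScale(swap ? srcH : srcW, swap ? srcW : srcH, targetW, targetH);

        float[] m = new float[16];
        m[0] = s[0] * cos;   // column 0, row 0
        m[1] = s[1] * sin;   // column 0, row 1
        m[4] = -s[0] * sin;  // column 1, row 0
        m[5] = s[1] * cos;   // column 1, row 1
        m[10] = 1f;
        m[15] = 1f;
        return m;
    }

    public static void main(String[] args) {
        // Landscape source drawn into a fixed portrait encoder surface, without rotation.
        float[] s = fitScale(2400, 1080, 1080, 2400);
        System.out.println("scale only: sx=" + s[0] + " sy=" + s[1]);
    }
}
```

For a 2400x1080 landscape source and a fixed 1080x2400 encoder surface, scaling alone shrinks the picture to about 20% of the frame height, while a 90-degree rotation lets it fill the surface with no bars.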

4. Decoding and Rendering: MediaCodec Pipelines and Hot-Resets

When the Android client acts as the helper (receiver), it decodes the incoming H.264 stream with a MediaCodec decoder. We implement three alternative decoding pathways:

  1. Direct Surface Render (Zero-Copy): The decoder outputs directly to a Surface managed by a UI view (SurfaceView or TextureView). The GPU composites the frame directly, delivering low latency.
  2. ImageReader Callback Mode: The decoder writes to the Surface of an ImageReader, with the codec color format set to COLOR_FormatSurface. The app fetches each buffer as a HardwareBuffer to render touch markers or generate UI snapshots.
  3. ByteBuffer Legacy Path: Standard software-based extraction used as a fallback for older Android devices.
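
One way to picture how an app might choose among the three pathways is a small selection function. The policy below is an assumption made for illustration, not the rule Easy Connect Suite applies: the class, enum, and parameter names are hypothetical, and the API 28 threshold reflects only that Image.getHardwareBuffer() became available at that level.

```java
/** Hypothetical policy for picking one of the three decode pathways. */
final class DecodePathChooser {

    enum Route { DIRECT_SURFACE, IMAGE_READER, BYTE_BUFFER }

    /** Image.getHardwareBuffer() is available from API 28 onward. */
    static final int MIN_API_HARDWARE_BUFFER = 28;

    /**
     * @param apiLevel           device API level
     * @param needsFrameAccess   true when the app must read frames (markers, snapshots)
     * @param surfaceOutputWorks false when Surface output is known to fail on this device
     */
    static Route choose(int apiLevel, boolean needsFrameAccess, boolean surfaceOutputWorks) {
        if (!surfaceOutputWorks) {
            return Route.BYTE_BUFFER;      // fallback: CPU-side extraction
        }
        if (!needsFrameAccess) {
            return Route.DIRECT_SURFACE;   // lowest latency: decoder renders to the view
        }
        // Frame access wanted: use the ImageReader path only where HardwareBuffer exists.
        return apiLevel >= MIN_API_HARDWARE_BUFFER ? Route.IMAGE_READER : Route.BYTE_BUFFER;
    }

    public static void main(String[] args) {
        System.out.println(choose(33, false, true));
        System.out.println(choose(33, true, true));
        System.out.println(choose(26, true, true));
    }
}
```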

Codec Recovery via Hot-Resets

On weak networks, packet loss can corrupt incoming frames, causing screen tearing, visual artifacts, or a stalled decoder thread.

To manage this, we monitor decoder errors and latency thresholds to trigger a Hot-Reset:

  • Error Detection: If the decoder throws a MediaCodec.CodecException or frame processing latency spikes, the app triggers requiresMediaCodecDecoderHardReset.
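  • In-Place Reset: Without rebuilding the UI rendering views, the app calls codec.flush() and codec.stop(), then re-instantiates the decoder against the same output surface. If the codec has already entered its Error state, flush() is not valid there, and the documented way out is reset() or release().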
  • Sync Request: The client sends a fast signal (encoder_sync) to the sender, requesting an immediate H.264 IDR keyframe to restore the display.
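
The trigger logic above can be sketched as a small state holder that is independent of the Android APIs. This is a minimal sketch under stated assumptions: the class name HotResetMonitor, its methods, and the threshold values (a latency limit and a count of consecutive slow frames) are hypothetical; only the encoder_sync signal name comes from the description above.

```java
/** Tracks decoder health and reports when a hot-reset should be performed. */
final class HotResetMonitor {
    private final long maxLatencyMs;   // assumed latency limit per frame
    private final int maxSlowFrames;   // assumed number of consecutive slow frames tolerated
    private int consecutiveSlowFrames;
    private boolean resetPending;

    HotResetMonitor(long maxLatencyMs, int maxSlowFrames) {
        this.maxLatencyMs = maxLatencyMs;
        this.maxSlowFrames = maxSlowFrames;
    }

    /** Call when the decoder throws, e.g. a MediaCodec.CodecException on Android. */
    void onDecoderError() {
        resetPending = true;
    }

    /** Call after each decoded frame with its measured processing latency. */
    void onFrameDecoded(long latencyMs) {
        if (latencyMs > maxLatencyMs) {
            consecutiveSlowFrames++;
            if (consecutiveSlowFrames >= maxSlowFrames) {
                resetPending = true;
            }
        } else {
            consecutiveSlowFrames = 0; // a healthy frame clears the streak
        }
    }

    boolean requiresReset() {
        return resetPending;
    }

    /** Call once the decoder has been re-created; returns the signal to send to the sender. */
    String onResetCompleted() {
        resetPending = false;
        consecutiveSlowFrames = 0;
        return "encoder_sync";
    }

    public static void main(String[] args) {
        HotResetMonitor monitor = new HotResetMonitor(200, 3);
        for (int i = 0; i < 3; i++) {
            monitor.onFrameDecoded(500);
        }
        if (monitor.requiresReset()) {
            // On Android, the decoder would be torn down and re-created here.
            System.out.println("send: " + monitor.onResetCompleted());
        }
    }
}
```

Requiring several slow frames in a row, rather than reacting to a single spike, is a design choice that avoids resetting the decoder on momentary jitter; the right count depends on the frame rate and network profile.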

This Android pipeline handles mobile network shifts and device fragmentation, providing a stable remote control experience. In the next part of this series, we will cover desktop screen capture and encoder optimization on Windows, macOS, and Linux.

Released under the MIT License.