Part 2 | Practices on Android Real-Time Screen Sharing and MediaCodec Hardware Codec Optimization
Android's hardware media codec API (MediaCodec) is historically known for device fragmentation and integration complexity. In remote support configurations, an Android device must function both as a controlled client (sender) that compresses high-resolution frames and as a helper client (receiver) that decodes H.264 video streams on the fly.
Achieving low latency on both ends requires resolving system-wide frame capture permissions, handling layout resizing without causing tearing, and building a resilient recovery mechanism for hardware codec failures.
In this deep dive, we explore how the Easy Connect Suite integrates MediaProjection, OpenGL ES, and MediaCodec to create a stable, low-latency video pipeline.
1. Capturing the Screen: MediaProjection and Foreground Services
Starting with Android 10 (API 29), Google enforces strict requirements for capturing display contents. To capture the screen, the app must run its MediaProjection session inside a running foreground service declared with the type mediaProjection.
Setup Flow:
- Permission Request: Call MediaProjectionManager.createScreenCaptureIntent() to trigger the OS projection consent prompt. The user's approval returns a token Intent.
- Service Initialization: Immediately call startForegroundService to launch our background capture service.
- Manifest Declaration: Register the service in AndroidManifest.xml with the appropriate type:
    <service android:name=".AssistScreenCaptureService" android:foregroundServiceType="mediaProjection" />
- Projection Acquisition: Retrieve the MediaProjection instance using the approved token intent, which acts as our system-wide display capture source.
2. Zero CPU Copying: MediaCodec Surface Mode
Traditional Android capture pipelines fetch screen buffers using an ImageReader. This pulls the raw frame from the GPU to CPU memory, converts it to YUV bytes, and pushes it back to the encoder. For 1080p resolutions, this memory copy loop causes heavy CPU loads, high heat generation, and dropped frames.
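To put a rough number on that copy loop, the following Java sketch estimates how many bytes the CPU has to touch per second. The RGBA_8888 capture format, the YUV 4:2:0 encoder input, and the 30 fps frame rate are illustrative assumptions rather than figures from Easy Connect Suite, and the class and method names are hypothetical.

```java
// Rough estimate of CPU-side memory traffic in an ImageReader-style pipeline.
// Assumptions (not from the article): RGBA_8888 capture, YUV 4:2:0 output, 30 fps.
final class CopyCostEstimate {

    /** Bytes in one RGBA_8888 frame: 4 bytes per pixel. */
    static long rgbaFrameBytes(int width, int height) {
        return (long) width * height * 4;
    }

    /** Bytes in one YUV 4:2:0 frame: 1.5 bytes per pixel (even dimensions assumed). */
    static long yuv420FrameBytes(int width, int height) {
        return (long) width * height * 3 / 2;
    }

    /** Bytes read (RGBA) plus bytes written (YUV) per second of video. */
    static long bytesPerSecond(int width, int height, int fps) {
        return (rgbaFrameBytes(width, height) + yuv420FrameBytes(width, height)) * fps;
    }

    public static void main(String[] args) {
        int width = 1920, height = 1080, fps = 30;
        System.out.println("RGBA frame bytes: " + rgbaFrameBytes(width, height));     // 8294400
        System.out.println("YUV420 frame bytes: " + yuv420FrameBytes(width, height)); // 3110400
        System.out.println("Bytes per second: " + bytesPerSecond(width, height, fps)); // 342144000
    }
}
```

Under these assumptions, roughly 342 MB of pixel data passes through CPU memory every second before the encoder receives a single frame, and that figure excludes the color conversion arithmetic itself.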
Easy Connect Suite implements a Surface Input Mode to keep data within the GPU:
[ Screen Capture (MediaProjection) ]
        │
        ├─ (Renders directly to Surface)
        ▼
[ Encoder Input Surface ] (Allocated via createInputSurface())
        │
        ▼ (Hardware encoder processes buffer in GPU RAM)
[ MediaCodec AVC Encoder ] (COLOR_FormatSurface Mode)

- Encoder Configuration: Initialize the video encoder (video/avc) and configure its input color format to MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface.
- Input Surface Creation: Call codec.createInputSurface() to get a GPU-backed drawing surface managed directly by the hardware codec.
- Virtual Display Setup: Create a VirtualDisplay using mediaProjection.createVirtualDisplay(...), binding its output directly to the encoder's input surface.
With this setup, the OS compositor (SurfaceFlinger) writes the screen buffer directly to the hardware encoder's GPU texture. The CPU handles only instruction routing, achieving zero-copy hardware encoding.
3. Smooth Transitions: OpenGL ES Orientation and Scaling Rotator
When a user rotates their Android phone between portrait and landscape during a support session, the screen resolution shifts (e.g., from 1080x2400 to 2400x1080). Because a MediaCodec encoding session cannot change its input surface dimensions dynamically, this shift produces stretched video frames or crashes the receiver's decoder.
To handle this, we route frames through our AssistCapturePortraitLandscapeRotator (an OpenGL ES frame converter):
[ MediaProjection Source ] ──► [ OES External Texture (GPU) ]
                                        │
                                        ▼ (OpenGL Render Loop)
                         [ Matrix Transform: Scale & Rotate ]
                                        │
                                        ▼ (EGLSurface Swap)
                         [ Encoder Input Surface (Fixed size) ]

- Intermediate Texture: Instead of sending display buffers directly to the encoder surface, we render them to an OpenGL OES External Texture (samplerExternalOES).
- EGL Context Setup: Run a background thread to manage a dedicated EGL Context, mapping the encoder's input surface as an EGLSurface.
- Projection Matrix Transform: When device rotation is detected, the thread updates the orthographic projection matrix:
  - Proportional Scaling: Scales the frame to match the target encoder resolution, centering the image and adding black bars to prevent stretching.
  - GPU Rotation: Applies a rotation matrix in the vertex shader to rotate the frame on the GPU.
- Drawing: Call eglSwapBuffers to write the processed frame to the encoder's input surface, outputting a video stream with a stable aspect ratio.
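The proportional scaling step can be sketched in plain Java as follows. This is a minimal illustration of aspect-preserving (letterbox) scaling for a full-screen quad in normalized device coordinates, not the actual AssistCapturePortraitLandscapeRotator code; the LetterboxMath class and its methods are hypothetical names, and the rotation component is left out for brevity.

```java
// Aspect-preserving (letterbox) scale for a full-screen quad in normalized device coordinates.
// Hypothetical helper for illustration only; rotation is intentionally omitted.
final class LetterboxMath {

    /**
     * Returns {scaleX, scaleY}, each in (0, 1]. One axis stays at 1.0 and the other
     * shrinks so the source keeps its aspect ratio inside the fixed-size target.
     */
    static double[] fitScale(int srcW, int srcH, int dstW, int dstH) {
        if (srcW <= 0 || srcH <= 0 || dstW <= 0 || dstH <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        // a / b equals sourceAspect / targetAspect; integer cross products avoid rounding.
        long a = (long) srcW * dstH;
        long b = (long) srcH * dstW;
        if (a > b) {
            // Source is wider than the target: fill the width, black bars above and below.
            return new double[] {1.0, (double) b / a};
        }
        // Source is taller than (or matches) the target: fill the height, bars left and right.
        return new double[] {(double) a / b, 1.0};
    }

    /** Column-major 4x4 scale matrix, the layout OpenGL ES uses for a mat4 uniform. */
    static float[] scaleMatrix(double scaleX, double scaleY) {
        float[] m = new float[16];
        m[0] = (float) scaleX;
        m[5] = (float) scaleY;
        m[10] = 1f;
        m[15] = 1f;
        return m;
    }

    public static void main(String[] args) {
        // Device rotated to landscape (2400x1080) while the encoder stays at 1080x2400.
        double[] s = fitScale(2400, 1080, 1080, 2400);
        System.out.println("scaleX=" + s[0] + " scaleY=" + s[1]);
    }
}
```

In the example from this section, a 2400x1080 landscape frame drawn into a fixed 1080x2400 encoder surface keeps its full width and occupies about 20% of the height, with the rest rendered as black bars. The resulting matrix would be multiplied with the rotation matrix before being uploaded to the vertex shader.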
4. Decoding and Rendering: MediaCodec Pipelines and Hot-Resets
When the Android client functions as a helper (receiver), it decodes the incoming H.264 payload using a MediaCodec decoder. We implement three alternative decoding pathways:
- Direct Surface Render (Zero-Copy): The decoder outputs directly to a Surface managed by a UI view (SurfaceView or TextureView). The GPU composites the frame directly, delivering low latency.
- ImageReader Callback Mode: The decoder writes to an ImageReader configured with COLOR_FormatSurface. The app fetches the buffer as a HardwareBuffer to render touch markers or generate UI snapshots.
- ByteBuffer Legacy Path: Standard software-based extraction used as a fallback for older Android devices.
Codec Recovery via Hot-Resets
On weak networks, packet loss can corrupt incoming frames, causing screen tearing, visual artifacts, or thread locks in the decoder.
To manage this, we monitor decoder errors and latency thresholds to trigger a Hot-Reset:
- Error Detection: If the decoder throws a MediaCodec.CodecException or frame processing latency spikes, the app triggers requiresMediaCodecDecoderHardReset.
- In-Place Reset: Without rebuilding the UI rendering views, the app runs codec.flush() and codec.stop(), re-instantiating the decoder.
- Sync Request: The client sends a fast signal (encoder_sync) to the sender, requesting an immediate H.264 IDR keyframe to restore the display.
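The error detection step above can be sketched as a small decision helper. This is one possible shape for such a trigger, not the Easy Connect Suite implementation: the HotResetPolicy class, the latency budget, and the rule of requiring several consecutive slow frames are all assumptions chosen for illustration.

```java
// Decides when a decoder hot-reset should be requested.
// Hypothetical policy: thresholds and the consecutive-frame rule are illustrative assumptions.
final class HotResetPolicy {
    private final long maxLatencyMs;
    private final int maxConsecutiveSlowFrames;
    private int slowFrames = 0;

    HotResetPolicy(long maxLatencyMs, int maxConsecutiveSlowFrames) {
        if (maxLatencyMs <= 0 || maxConsecutiveSlowFrames <= 0) {
            throw new IllegalArgumentException("thresholds must be positive");
        }
        this.maxLatencyMs = maxLatencyMs;
        this.maxConsecutiveSlowFrames = maxConsecutiveSlowFrames;
    }

    /** A codec exception always requests a reset and clears the slow-frame streak. */
    boolean onCodecError() {
        slowFrames = 0;
        return true;
    }

    /**
     * Records the processing latency of one decoded frame. Returns true once the
     * configured number of consecutive frames has exceeded the latency budget.
     */
    boolean onFrameDecoded(long latencyMs) {
        if (latencyMs > maxLatencyMs) {
            slowFrames++;
        } else {
            slowFrames = 0; // a single on-time frame ends the streak
        }
        if (slowFrames >= maxConsecutiveSlowFrames) {
            slowFrames = 0; // start counting afresh after the reset
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        HotResetPolicy policy = new HotResetPolicy(100, 3);
        long[] latencies = {150, 150, 50, 150, 150, 150};
        for (long latency : latencies) {
            System.out.println(latency + " ms -> reset requested: " + policy.onFrameDecoded(latency));
        }
    }
}
```

Requiring a streak of slow frames rather than a single spike is a design choice that avoids resetting the decoder on one late packet. When either method returns true, the caller would perform the in-place reset and send the encoder_sync request described above.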
This Android pipeline handles mobile network shifts and device fragmentation, providing a stable remote control experience. In the next part of this series, we will cover desktop screen capture and encoder optimization on Windows, macOS, and Linux.
