Module Lunchify
Package jku.se

Class CloudOCRService

java.lang.Object
jku.se.CloudOCRService

public class CloudOCRService extends Object
Service class for performing OCR (Optical Character Recognition) using the Google Cloud Vision API. Extracts structured data (date, amount, and category) from an image of a receipt or invoice.
  • Constructor Details

    • CloudOCRService

      public CloudOCRService()
  • Method Details

    • analyzeImage

      public CloudOCRService.OCRResult analyzeImage(File imageFile) throws IOException
      Analyzes an image using the Google Cloud Vision API and extracts structured information (date, amount, and category) from the detected text.
      Parameters:
      imageFile - The image file to analyze.
      Returns:
      OCRResult containing date, amount, and category.
      Throws:
      IOException - if the request fails.
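
      This page does not document how the request is sent or authenticated. The following is a minimal sketch of one way to build the HTTP call with the Java 11+ java.net.http API, assuming the public REST endpoint of the Vision API and API-key authentication through a "key" query parameter. The class name VisionRequestSketch, the method buildRequest, and the apiKey parameter are hypothetical and not part of CloudOCRService.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of CloudOCRService.
final class VisionRequestSketch {
    // REST endpoint for image annotation; API-key authentication is an assumption.
    static final String ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

    // Builds (but does not send) a POST request carrying the JSON body.
    static HttpRequest buildRequest(String apiKey, String requestJson) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT + "?key=" + apiKey))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestJson))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("TEST_KEY", "{\"requests\":[]}");
        System.out.println(request.method() + " " + request.uri());
        // Sending would look like:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

      Building the request separately from sending it keeps the network call out of unit tests. A failed send surfaces as an IOException, which matches the Throws clause above.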
    • encodeImageToBase64

      public String encodeImageToBase64(File imageFile) throws IOException
      Encodes an image file to a Base64 string.
      Parameters:
      imageFile - The file to encode.
      Returns:
      Base64-encoded image string.
      Throws:
      IOException - if reading the file fails.
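
      A minimal sketch of how such an encoding step can be written with the standard library, assuming the whole file fits in memory. The class name Base64Sketch is hypothetical; the actual implementation may differ.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Base64;

// Hypothetical stand-in for the documented method.
final class Base64Sketch {
    // Reads the whole file into memory and returns its standard Base64 encoding.
    static String encodeImageToBase64(File imageFile) throws IOException {
        byte[] bytes = Files.readAllBytes(imageFile.toPath());
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file with known content stands in for a real image.
        File tmp = File.createTempFile("receipt", ".bin");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(encodeImageToBase64(tmp)); // prints "YWJj"
    }
}
```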
    • buildRequestJson

      public String buildRequestJson(String base64Image)
      Builds the JSON request body for the OCR API.
      Parameters:
      base64Image - Base64-encoded image string.
      Returns:
      JSON request as a string.
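
      The exact request body is not shown on this page. As a sketch, a Vision API annotate request wraps the Base64 image and a list of requested features in a "requests" array. The feature type TEXT_DETECTION is an assumption (DOCUMENT_TEXT_DETECTION is another option), and the class name RequestJsonSketch is hypothetical.

```java
// Hypothetical stand-in for the documented method.
final class RequestJsonSketch {
    // Base64 output contains only A-Z, a-z, 0-9, '+', '/', and '=',
    // so it can be embedded in a JSON string without escaping.
    static String buildRequestJson(String base64Image) {
        return "{\"requests\":[{"
                + "\"image\":{\"content\":\"" + base64Image + "\"},"
                + "\"features\":[{\"type\":\"TEXT_DETECTION\"}]"
                + "}]}";
    }

    public static void main(String[] args) {
        System.out.println(buildRequestJson("YWJj"));
    }
}
```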
    • extractTextFromJson

      public String extractTextFromJson(String json)
      Parses the OCR API response JSON and extracts the full detected text.
      Parameters:
      json - JSON response from the API.
      Returns:
      Extracted plain text.
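
      A dependency-free sketch of the parsing step, assuming the response follows the Vision API shape in which the first "description" value under "textAnnotations" holds the full detected text. A real implementation would more likely use a JSON library. The regex approach, the empty-string fallback, and the class name ResponseTextSketch are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented method.
final class ResponseTextSketch {
    // Matches the first "description" string value, allowing escaped characters inside it.
    private static final Pattern DESCRIPTION =
            Pattern.compile("\"description\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    // Returns the unescaped text, or an empty string if no description is present (assumption).
    static String extractTextFromJson(String json) {
        Matcher m = DESCRIPTION.matcher(json);
        return m.find() ? unescape(m.group(1)) : "";
    }

    // Decodes common JSON escapes; any other escaped character is kept literally.
    private static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c);
                continue;
            }
            char e = s.charAt(++i);
            switch (e) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'u':
                    // Four hex digits follow in valid JSON.
                    out.append((char) Integer.parseInt(s.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default: out.append(e); // covers \" \\ and \/
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"responses\":[{\"textAnnotations\":[{\"description\":\"BILLA\\nTotal 12,50\"}]}]}";
        System.out.println(extractTextFromJson(json));
    }
}
```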
    • extractDate

      public String extractDate(String text)
      Extracts a date string from raw OCR text by trying several regular-expression patterns, one per supported date format.
      Parameters:
      text - Raw OCR text.
      Returns:
      Extracted date string or "Not found".
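
      The supported formats are not listed here. The sketch below assumes three of them (dd.MM.yyyy, dd/MM/yyyy, and yyyy-MM-dd) and returns the first match of the first pattern that matches anywhere in the text. The class name DateSketch and the pattern list are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented method.
final class DateSketch {
    // Assumed formats, tried in this order.
    private static final Pattern[] DATE_PATTERNS = {
        Pattern.compile("\\b\\d{1,2}\\.\\d{1,2}\\.\\d{4}\\b"), // 03.05.2025
        Pattern.compile("\\b\\d{1,2}/\\d{1,2}/\\d{4}\\b"),     // 03/05/2025
        Pattern.compile("\\b\\d{4}-\\d{2}-\\d{2}\\b")          // 2025-05-03
    };

    // Pattern order decides priority, not the position of the match in the text.
    static String extractDate(String text) {
        for (Pattern p : DATE_PATTERNS) {
            Matcher m = p.matcher(text);
            if (m.find()) {
                return m.group();
            }
        }
        return "Not found";
    }

    public static void main(String[] args) {
        System.out.println(extractDate("BILLA\nDatum: 03.05.2025 12:41")); // prints "03.05.2025"
        System.out.println(extractDate("no date here"));                   // prints "Not found"
    }
}
```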
    • extractAmount

      public String extractAmount(String text)
      Extracts the amount from OCR text, prioritizing labeled values (e.g., "Amount").
      Parameters:
      text - OCR-detected text.
      Returns:
      Extracted amount string or "Not found".
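
      One way to prioritize labeled values is a two-pass search: first look for a number on the same line as a known label, then fall back to the first decimal number anywhere in the text. In this sketch the labels "total" and "summe" (besides "amount"), the two-decimal number format, and the class name AmountSketch are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented method.
final class AmountSketch {
    // A number with exactly two decimals, using '.' or ',' as separator (assumed format).
    private static final String NUMBER = "(\\d+[.,]\\d{2})";

    // A label, then any non-digit characters on the same line, then the number.
    private static final Pattern LABELED = Pattern.compile(
            "(?i)\\b(?:amount|total|summe)\\b[^\\d\\n]*" + NUMBER);

    private static final Pattern ANY = Pattern.compile(NUMBER);

    static String extractAmount(String text) {
        Matcher labeled = LABELED.matcher(text);
        if (labeled.find()) {
            return labeled.group(1);
        }
        // No labeled value: fall back to the first number-like token.
        Matcher any = ANY.matcher(text);
        return any.find() ? any.group(1) : "Not found";
    }

    public static void main(String[] args) {
        System.out.println(extractAmount("Cola 2,50\nTotal: EUR 12,50")); // prints "12,50"
        System.out.println(extractAmount("thanks"));                      // prints "Not found"
    }
}
```

      The fallback pass is deliberately loose: it would also pick up the leading part of a date such as 03.05.2025, so a stricter implementation would exclude date-like tokens first.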
    • detectCategory

      public String detectCategory(String text)
      Classifies the receipt as coming from a restaurant or a supermarket based on the OCR text, falling back to "OTHER" when neither is detected.
      Parameters:
      text - OCR-detected text.
      Returns:
      Category label: "RESTAURANT", "SUPERMARKET", or "OTHER".
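
      The detection rule is not specified on this page. A simple keyword lookup is one plausible approach, sketched below. The keyword lists, the choice to check restaurant keywords first, and the class name CategorySketch are all assumptions.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the documented method.
final class CategorySketch {
    // Illustrative keyword lists; the real lists are not documented.
    private static final List<String> RESTAURANT =
            List.of("restaurant", "cafe", "pizzeria", "bistro");
    private static final List<String> SUPERMARKET =
            List.of("supermarket", "billa", "spar", "hofer");

    // Case-insensitive substring match; restaurant keywords win if both lists match.
    static String detectCategory(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        if (containsAny(lower, RESTAURANT)) {
            return "RESTAURANT";
        }
        if (containsAny(lower, SUPERMARKET)) {
            return "SUPERMARKET";
        }
        return "OTHER";
    }

    private static boolean containsAny(String haystack, List<String> keywords) {
        return keywords.stream().anyMatch(haystack::contains);
    }

    public static void main(String[] args) {
        System.out.println(detectCategory("Pizzeria Roma\nTotal 12,50")); // prints "RESTAURANT"
        System.out.println(detectCategory("BILLA AG"));                   // prints "SUPERMARKET"
        System.out.println(detectCategory("Shell Tankstelle"));           // prints "OTHER"
    }
}
```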