Overview

After a file split task completes, you can use the amend category API to correct the category of a child file or adjust the page range it covers.
The API operates on the child files generated by a file split parent task (task_type = 2), so you must first obtain the parent task's task_id.

Use Cases

  1. Split Category Correction: After automatic splitting, some child files are assigned the wrong category and need manual correction.
  2. Page Range Adjustment: The page ranges of the split files need to change, for example by merging several child files or dividing the pages differently.

API Endpoint

Endpoint: POST /api/app-api/sip/platform/v2/file/amend_category

Request Parameters

Parameter | Type | Required | Description
workspace_id | string | Yes | Workspace ID
task_id | string | Yes | Parent task ID (file split task ID)
split_tasks | array | Yes | File split task list; each element contains category and pages

split_tasks Parameter Description

Parameter | Type | Required | Description
category | string | Yes | Child task file category
pages | array | Yes | Child file page number array, starting from 0

Parameter Description

  • task_id: Parent task ID (task_type = 2). It can be obtained through the file/fetch API.
  • category: New file category name. It must be a file category already configured in the DocFlow workspace. If a child file does not need a category change, pass its original category.
  • pages: Page number array listing the pages of the original file that belong to this child file. Numbering starts at 0, so [0, 1] refers to the first and second pages. If a child file does not need a page change, pass its original page numbers.

Important: The split_tasks array must describe every child file of the split, including those whose category and page numbers stay the same. If only some child files are submitted, the unlisted child files may be deleted or the request may fail with a processing exception.
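
One way to satisfy the full-list requirement is to copy every existing child entry and change only the one that needs correction. The following is a minimal sketch, assuming the current child files are already available as a list of dicts with category and pages keys; amend_one_category is a hypothetical helper for illustration and is not part of the DocFlow API.

```python
from copy import deepcopy

def amend_one_category(current_tasks, index, new_category):
    """Return a complete split_tasks list with exactly one category changed.

    current_tasks: list of {"category": str, "pages": [int, ...]} covering
                   ALL child files of the split (assumed shape).
    index:         position of the child file whose category is corrected.
    """
    if not 0 <= index < len(current_tasks):
        raise IndexError(f"no child file at index {index}")
    updated = deepcopy(current_tasks)  # leave the caller's list untouched
    updated[index]["category"] = new_category
    return updated

# Sample data mirroring the three child files used in the curl example
current = [
    {"category": "Receipt", "pages": [0, 1]},
    {"category": "Contract", "pages": [2, 3, 4]},
    {"category": "Receipt", "pages": [5, 6]},
]
split_tasks = amend_one_category(current, 0, "Electronic Invoice (Regular)")
print(len(split_tasks))  # every child file is still present in the list
```

The returned list can be passed as the split_tasks field of the request body, so the unchanged child files are resubmitted together with the corrected one.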

Example Code

# Important: Must include all split child file information
# Assuming the original file is split into 3 child files, even if you only need to modify the first child file's category,
# you must include information for all 3 child files
curl -X POST \
  -H "x-ti-app-id: <your-app-id>" \
  -H "x-ti-secret-code: <your-secret-code>" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "1234567890",
    "task_id": "1234567890",
    "split_tasks": [
      {
        "category": "Electronic Invoice (Regular)",
        "pages": [0, 1]
      },
      {
        "category": "Contract",
        "pages": [2, 3, 4]
      },
      {
        "category": "Receipt",
        "pages": [5, 6]
      }
    ]
  }' \
  "https://docflow.textin.com/api/app-api/sip/platform/v2/file/amend_category"

Get Parent Task ID and Child File Information

Before modifying the file category, you need to obtain the parent task’s task_id and child file information. You can query it through the file/fetch API:
curl \
  -H "x-ti-app-id: <your-app-id>" \
  -H "x-ti-secret-code: <your-secret-code>" \
  "https://docflow.textin.com/api/app-api/sip/platform/v2/file/fetch?workspace_id=<your-workspace-id>&file_id=<your-file-id>"

Response

After successfully modifying the file category, the API returns a success response:
{
  "code": 200,
  "msg": "success"
}
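
A caller will usually want to check the response body before assuming the amendment was applied. The sketch below treats a body with code equal to 200 as success, matching the response shown above; the shape of a failure body is not documented here, so the failing sample is only an assumption, and is_amend_successful is a hypothetical helper name.

```python
def is_amend_successful(response_body):
    """Return True when the parsed JSON body matches the documented success shape."""
    return isinstance(response_body, dict) and response_body.get("code") == 200

ok = is_amend_successful({"code": 200, "msg": "success"})
# Assumed failure shape, for illustration only
failed = is_amend_successful({"code": 500, "msg": "internal error"})
print(ok, failed)
```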

Complete Example

The following is a complete example showing how to query file split task information and then modify child file categories and page numbers:
Python
import requests
import json

def get_split_task_info(workspace_id, file_id, app_id, secret_code):
    """Get file split task information"""
    url = "https://docflow.textin.com/api/app-api/sip/platform/v2/file/fetch"
    headers = {
        "x-ti-app-id": app_id,
        "x-ti-secret-code": secret_code
    }
    params = {"workspace_id": workspace_id, "file_id": file_id}
    response = requests.get(url, headers=headers, params=params)
    return response.json()

def amend_split_category(workspace_id, task_id, split_tasks, app_id, secret_code):
    """Amend category and page numbers for file split tasks"""
    url = "https://docflow.textin.com/api/app-api/sip/platform/v2/file/amend_category"
    headers = {
        "x-ti-app-id": app_id,
        "x-ti-secret-code": secret_code,
        "Content-Type": "application/json"
    }
    payload = {
        "workspace_id": workspace_id,
        "task_id": task_id,
        "split_tasks": split_tasks
    }
    response = requests.post(url, headers=headers, json=payload)
    return response.json()

# Usage example
WORKSPACE_ID = "1234567890"
FILE_ID = "202412190001"
APP_ID = "<your-app-id>"
SECRET_CODE = "<your-secret-code>"

# 1. Get file split task information
file_info = get_split_task_info(WORKSPACE_ID, FILE_ID, APP_ID, SECRET_CODE)
files = file_info.get("result", {}).get("files", [])
if files:
    file_data = files[0]
    parent_task_id = file_data.get("task_id")
    task_type = file_data.get("task_type")
    child_files = file_data.get("child_files", [])
    
    # Confirm it's a file split parent task (task_type = 2)
    if task_type == 2:
        print(f"Parent task ID: {parent_task_id}")
        print("Current child file information:")
        
        # Build split_tasks parameter
        # Important: Must include all child file information, even if some don't need modification
        split_tasks = []
        for child in child_files:
            if child.get("task_type") == 0:  # Child files generated by file split
                # Normalize the current page information into a flat list of page numbers.
                # The shape of "pages" in the fetch response may vary, so several
                # shapes are handled defensively.
                pages_info = child.get("pages", [])
                if isinstance(pages_info, dict):
                    # A dictionary that wraps the array under a "pages" key
                    pages = pages_info.get("pages", [])
                elif isinstance(pages_info, list):
                    # A plain array of numbers, or an array of objects with a "page" key
                    pages = [p.get("page") if isinstance(p, dict) else p for p in pages_info]
                else:
                    # Anything else is treated as "no pages"
                    pages = []
                
                current_category = child.get("category")
                print(f"  - Category: {current_category}, Pages: {pages}")
                
                # Example: Only modify the first child file's category, keep others unchanged
                # Note: Must include all child files, even if they don't need modification
                if len(split_tasks) == 0:
                    # Modify the first child file's category
                    split_tasks.append({
                        "category": "Electronic Invoice (Regular)",  # New category
                        "pages": pages  # Keep original page numbers
                    })
                else:
                    # Other child files keep original category and page numbers
                    split_tasks.append({
                        "category": current_category,  # Keep original category
                        "pages": pages  # Keep original page numbers
                    })
        
        # 2. Amend file category and page numbers
        # Ensure all child file information is included
        if split_tasks:
            print(f"\nPreparing to submit information for {len(split_tasks)} child files")
            result = amend_split_category(WORKSPACE_ID, parent_task_id, split_tasks, APP_ID, SECRET_CODE)
            print(f"Amendment result: {json.dumps(result, indent=2, ensure_ascii=False)}")
    else:
        print(f"This task is not a file split parent task (task_type={task_type})")

Page Number Notes

  • Page numbers start from 0, meaning page 1 corresponds to page number 0, page 2 corresponds to page number 1, and so on
  • The pages array indicates the original file page numbers contained in this child file
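
Because people usually think in 1-based page numbers, an off-by-one error is easy to make when building the pages array. A small sketch of the conversion, where pages_from_range is a hypothetical helper that takes an inclusive 1-based range as it would appear in a document viewer:

```python
def pages_from_range(first_page, last_page):
    """Convert an inclusive 1-based page range into a 0-based pages array."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return list(range(first_page - 1, last_page))

print(pages_from_range(1, 2))  # [0, 1]
print(pages_from_range(3, 5))  # [2, 3, 4]
```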

Notes

  1. Must Include All Child Files: The split_tasks array must describe every child file of the split, including those whose category and page numbers stay the same. If only some child files are submitted, the unlisted child files may be deleted or the request may fail with a processing exception.
  2. Task Type Restriction: Only file split parent tasks (task_type = 2) support the split_tasks parameter.
  3. Category Must Exist: The specified category must already be configured in the DocFlow workspace; otherwise an error is returned.
  4. Category Name Matching: Category names must match the configured names exactly (case-sensitive).
  5. Page Range: Every page number in a pages array must be within the valid range (0 to total pages - 1) and must not be repeated within that array.
  6. No Duplicate Pages: Each page number can appear in only one child file; page ranges must not overlap.
  7. Reprocessing After Modification: After categories and page numbers are modified, the system reprocesses the data according to the new categories and page ranges.
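
The page rules in notes 5 and 6 can be checked on the client before the request is sent. This is a minimal sketch, not the server's actual validation: validate_split_tasks is a hypothetical helper, and total_pages is assumed to be known to the caller (for example from the original file).

```python
def validate_split_tasks(split_tasks, total_pages):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    seen = {}  # page number -> index of the child file that first claimed it
    for i, task in enumerate(split_tasks):
        if not task.get("category"):
            problems.append(f"child {i}: missing category")
        for page in task.get("pages", []):
            if not isinstance(page, int) or not 0 <= page < total_pages:
                problems.append(f"child {i}: page {page} is outside 0..{total_pages - 1}")
            elif page in seen:
                problems.append(f"child {i}: page {page} already used by child {seen[page]}")
            else:
                seen[page] = i
    return problems

good = [
    {"category": "Contract", "pages": [0, 1]},
    {"category": "Receipt", "pages": [2]},
]
bad = [
    {"category": "Contract", "pages": [0, 1]},
    {"category": "Receipt", "pages": [1, 7]},  # page 1 overlaps, page 7 is out of range
]
print(validate_split_tasks(good, total_pages=3))
for problem in validate_split_tasks(bad, total_pages=3):
    print(problem)
```

The check does not verify that category names exist in the workspace or that every page of the original file is assigned; the first requires workspace configuration data, and the second is not stated as a requirement here.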