
Search Parameters in Box

Comments

2 comments

  • France

    Hi Mark,

    Welcome to the Box Community!

    Have you checked out this KB article on Searching for Files, Folders, and Content?
     

    Search Indexing

    Box has a secure index for content, much like the index in the back of a textbook. Every time a file or folder is changed, we add its words to the index in a process called indexing. When you conduct a search, we look in the search index for files and folders that match your query. When content is added, updated, or deleted in Box, we update the search index accordingly.

    • Search Availability: It can take time after a file is uploaded or modified before it is fully indexed and ready to be searched. In most cases, you can expect newly added or changed files to be available via Box search within 10 minutes. The current service load determines the indexing time, so it may take more than 10 minutes in some cases.
    • Search Access: Only content that you have access to (if you can preview and/or view) will display in search. Simply put, if you don’t have access to a file or folder within your account, it won’t show up in your search results.  Files that you recently accessed via shared link can also appear in your search results.  In addition, if you search within trash, the results display only content for which you have previewer/uploader permission or above.
    • Prefix Matching and Wildcard Search: Trailing wildcards (also known as prefix matching) are implicitly included in search results because of the way text is indexed. Searching for “Bo” returns items with titles containing “Box”, “Boat”, or “Boxer”. It is the equivalent of searching for Bo* or Bo% in traditional search engines. Traditional wildcard notation, such as %ox%, is not supported by Box. While we support prefix matching on titles, we do not support prefix matching on body content, suffix matching in the title or body content, or infix (middle of the word) matching in the title or body content. For example, a search on “cal” would match a file named “California” but not one named “decal” or “recall”. It also would not match a file merely because its body content contains “California”, “recall”, or “decal”, since prefix, infix, and suffix matching do not apply to body content.
    • Stemming: Box Search uses stemming to match terms from the query to terms in the index. Because of this, words that include the same stem may be included in the result set, even if the words do not contain the exact form in the query.  For example, “run” and “running” map to the same stem, so a search on “running” may return a document containing “run” in the title.
    • File Content Searching: The content within your files is also stored within the Box search index. The following file types support file content search: 'boxnote', 'csv', 'doc', 'docx', 'gdoc', 'gsheet', 'gslide', 'gslides', 'htm', 'html', 'msg', 'odp', 'odt', 'ods', 'pdf', 'ppt', 'pptx', 'rtf', 'tsv', 'wpd', 'xhtml', 'xls', 'xlsm', 'xlsx', 'xml', 'xsd', 'xsl', 'as', 'as3', 'asm', 'bat', 'c', 'cc', 'cmake', 'cpp', 'cs', 'css', 'cxx', 'diff', 'erb', 'groovy', 'h', 'haml', 'hh', 'java', 'js', 'json', 'less', 'log', 'm', 'make', 'md', 'ml', 'mm', 'php', 'pl', 'plist', 'properties', 'py', 'rb', 'rst', 'sass', 'scala', 'script', 'scm', 'sml', 'sql', 'sh', 'txt', 'vi', 'vim', 'webdoc', 'yaml'.
    • Indexed Text per Document: The Box search index stores up to 10,000 bytes (~10,000 characters in English) per document for accounts at the Business level and above. This amount can vary from document to document because of language, Box’s indexing method, and document type.
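
    To make the prefix-matching rule above concrete, here is a minimal Python sketch of the behavior as described for titles. It is an illustration only, not Box's implementation: the function name, the word tokenization, and the case-insensitive comparison are all assumptions, and stemming is ignored.

```python
import re


def title_prefix_match(query: str, title: str) -> bool:
    """Hypothetical helper: True if any word in the title starts with the query.

    Assumptions (not confirmed by Box): titles are split on non-word
    characters and the comparison is case-insensitive.
    """
    q = query.casefold()
    words = re.findall(r"\w+", title.casefold())
    return any(word.startswith(q) for word in words)


if __name__ == "__main__":
    # "cal" is a prefix of "California" but only a suffix/infix of the others.
    for title in ["California", "decal", "recall"]:
        print(title, title_prefix_match("cal", title))
```

    Under these assumptions, "cal" matches "California" only, and "Bo" matches "Box", "Boat", and "Boxer", which mirrors the examples in the bullet above.
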
    This may be relevant to your problem and help with the next steps!
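
    If it helps with troubleshooting, the File Content Searching and Indexed Text per Document points can be turned into a rough local check. The Python sketch below is hypothetical: it assumes you have already extracted a file's text yourself, that the index keeps the first 10,000 bytes of that text, and that the text is UTF-8 encoded. None of these details are confirmed above, and the real amount can vary by language and document type.

```python
from pathlib import Path

# Limit quoted in the list above; the actual indexed amount can vary.
INDEXED_BYTES = 10_000

# A small subset of the supported extensions listed above, for illustration.
CONTENT_SEARCH_EXTENSIONS = {"csv", "doc", "docx", "pdf", "txt", "xls", "xlsm", "xlsx"}


def supports_content_search(filename: str) -> bool:
    """Hypothetical helper: is the file extension in the (partial) supported set?"""
    ext = Path(filename).suffix.lower().lstrip(".")
    return ext in CONTENT_SEARCH_EXTENSIONS


def within_indexed_window(text: str, needle: str, limit: int = INDEXED_BYTES) -> bool:
    """Hypothetical helper: does `needle` end within the first `limit` bytes of `text`?

    Assumes the index stores the *first* `limit` bytes of UTF-8 text, which
    is an assumption made for this sketch.
    """
    data = text.encode("utf-8")
    target = needle.encode("utf-8")
    pos = data.find(target)
    return pos != -1 and pos + len(target) <= limit


if __name__ == "__main__":
    extracted = "x" * 12_000 + "12345abcde"  # string sits past the assumed limit
    print(supports_content_search("file3.xlsx"))
    print(within_indexed_window(extracted, "12345abcde"))
```

    If a search string only appears beyond the indexed portion of a large file, that would be one possible reason for it not to show up in results, though this is a guess rather than a confirmed diagnosis.
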
     
  • Mark Haworth

    Yes, I had. Thank you for the extra clarity on the function, though.

    The problem breaks down to this: I know several files (e.g. file1, file2, file3) have a particular string in them; let's say "12345abcde". File1 and file2 I can edit with Excel Online because I'm a collaborator on them, but file3 I would have to download and re-upload to make any changes. If I do a search in Box for "12345abcde", it will (sometimes) return results for file1 and file2, but not file3. Other times it will return results for file1, file2, and file3.

    Essentially, I was wondering whether there is some quirk of the file format that would keep it from returning a result for that search string because of editing parameters, or whether I need to talk to the uploaders and get them to change something about their file type so that search returns the correct results.

     

    Thanks
