Newly available. Recommended for serious text-processing features that must respect international writing systems.

Overview

Intl.Segmenter splits text into graphemes, words, or sentences according to locale-aware rules. It is especially useful for languages where simple whitespace splitting is not enough.

Browser support

                               Desktop                             Mobile
Feature                        Chrome   Edge   Firefox   Safari    Chrome Android   Safari iOS
Intl.Segmenter                 87       87     125       14.1      87               14.5

Built-in object
Intl.Segmenter() constructor   87       87     125       14.1      87               14.5
resolvedOptions()              87       87     125       14.1      87               14.5
segment()                      87       87     125       14.1      87               14.5
supportedLocalesOf()           87       87     125       14.1      87               14.5
Segments                       87       87     125       14.1      87               14.5
Segments [Symbol.iterator]()   87       87     125       14.1      87               14.5
Segments containing()          87       87     125       14.1      87               14.5

Each number is the first browser version that supports the feature.

The Intl.Segmenter() constructor creates Intl.Segmenter objects.

The resolvedOptions() method of Intl.Segmenter instances returns a new object with properties reflecting the options computed during initialization of this Segmenter object.

The segment() method of Intl.Segmenter instances segments a string according to the locale and granularity of this Intl.Segmenter object.

The Intl.Segmenter.supportedLocalesOf() static method returns an array containing those of the provided locales that are supported in segmentation without having to fall back to the runtime's default locale.

A Segments object is an iterable collection of the segments of a text string. It is returned by a call to the segment() method of an Intl.Segmenter object.

The [Symbol.iterator]() method of Segments instances implements the iterable protocol and allows Segments objects to be consumed by most syntaxes expecting iterables, such as the spread syntax and for...of loops. It returns a segments iterator object that yields data about each segment.

The containing() method of Segments instances returns an object describing the segment in the string that includes the code unit at the specified index.

Sub-feature descriptions sourced from MDN Web Docs (CC BY-SA 2.5)
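
The Segments behavior described above can be sketched as follows. This is a minimal illustration, not a complete reference: the sample string and the variable names (pieces, hit) are assumptions chosen for the example.

```javascript
// Sketch: consume a Segments object by iteration and by index lookup.
const wordSegmenter = new Intl.Segmenter('en', { granularity: 'word' });
const sample = 'Hello world';
const segments = wordSegmenter.segment(sample);

// Segments is iterable, so for...of (and the spread syntax) can read it.
// Each yielded item is a data object, not a plain string.
const pieces = [];
for (const { segment, index, isWordLike } of segments) {
  pieces.push({ segment, index, isWordLike });
}

// containing(i) returns the segment that covers code unit index i.
// Index 7 is the 'o' inside 'world', which starts at index 6.
const hit = segments.containing(7);

console.log(pieces.map((p) => p.segment)); // ['Hello', ' ', 'world']
console.log(hit.segment, hit.index);       // world 6
```

Note that the whitespace between the words is reported as its own segment, with isWordLike set to false.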

Syntax

JAVASCRIPT
const segmenter = new Intl.Segmenter('ja', { granularity: 'word' });
const segments = [...segmenter.segment('今日は天気がいいです')];
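// Each element is a segment data object; read its .segment property for the text.
// Typical .segment values: '今日', 'は', '天気', 'が', 'いい', 'です' (exact boundaries depend on the engine's locale data)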

Live demo

Intl.Segmenter

Splits text into words or sentences. Useful for languages such as Japanese that do not separate words with whitespace.


Granularity types

There are three granularities: grapheme (user-perceived character), word, and sentence.
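
As a rough sketch of how the granularity option changes the result, the helper below (split is a hypothetical name, and the sample text is an assumption) runs the same English string through each setting.

```javascript
// Sketch: segment one string at each granularity.
const text = 'Hi there. Bye!';

// Hypothetical helper: returns the segment strings for a given granularity.
function split(granularity) {
  const segmenter = new Intl.Segmenter('en', { granularity });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

console.log(split('grapheme').length); // 14 (one per user-perceived character)
console.log(split('word'));            // ['Hi', ' ', 'there', '.', ' ', 'Bye', '!']
console.log(split('sentence'));        // ['Hi there. ', 'Bye!']
```

Whatever the granularity, joining the segments back together reproduces the original string, because spaces and punctuation are kept as segments of their own.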


Splitting emoji safely

With grapheme granularity, an emoji is also treated correctly as a single character.
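
To illustrate, the sketch below compares three ways of counting a family emoji built from four emoji joined by zero-width joiners. The emoji is written with escape sequences so the joiners are visible; countGraphemes is a hypothetical helper name.

```javascript
// Sketch: count one ZWJ emoji sequence three different ways.
// Man + ZWJ + woman + ZWJ + girl + ZWJ + boy renders as a single family emoji.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';

// Hypothetical helper: number of user-perceived characters in a string.
function countGraphemes(str) {
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
  return [...segmenter.segment(str)].length;
}

console.log(family.length);          // 11 UTF-16 code units
console.log([...family].length);     // 7 code points
console.log(countGraphemes(family)); // 1 grapheme
```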


Use cases

  • Word-aware processing

    Segment text for counters, highlights, or analysis in languages that do not rely on spaces between words.

  • Grapheme-safe editing

    Treat user-perceived characters more accurately when cursor movement or counting must respect composed characters.
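
Both use cases can be sketched with small helpers. The function names countWords and truncateGraphemes are hypothetical, and the code is a minimal illustration under those assumptions rather than a production implementation.

```javascript
// Sketch: count words by keeping only segments flagged as word-like.
function countWords(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'word' });
  let count = 0;
  for (const { isWordLike } of segmenter.segment(text)) {
    if (isWordLike) count += 1; // skips spaces and punctuation
  }
  return count;
}

// Sketch: keep at most `max` user-perceived characters without
// cutting a composed character in half.
function truncateGraphemes(text, max, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
  let out = '';
  let kept = 0;
  for (const { segment } of segmenter.segment(text)) {
    if (kept === max) break;
    out += segment;
    kept += 1;
  }
  return out;
}

console.log(countWords('Hello, world! How are you?', 'en')); // 5

// 'e' + combining acute accent is one grapheme but two code units.
const word = 'e\u0301te\u0301';
console.log(truncateGraphemes(word, 2, 'en') === 'e\u0301t'); // true
console.log(word.slice(0, 2) === 'e\u0301');                  // true: slice counts code units
```

The same countWords call works for a locale such as 'ja', though the exact word boundaries then depend on the engine's locale data.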

Cautions

  • Segmentation rules vary by locale, so choose the locale intentionally rather than assuming one universal behavior.
  • It improves tokenization but does not replace full natural-language understanding or grammar-aware parsing.
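
One way to make the locale choice explicit is to check what the runtime supports and what it actually resolved. The sketch below uses supportedLocalesOf() and resolvedOptions() for that. The requested locale list is an assumption, and the printed values vary by runtime.

```javascript
// Sketch: verify locale support before relying on locale-specific rules.
const requested = ['ja', 'en'];

// Subset of `requested` that is supported without falling back to the default locale.
const supported = Intl.Segmenter.supportedLocalesOf(requested);

// The constructor picks the best available match from the list.
const localeSegmenter = new Intl.Segmenter(requested, { granularity: 'word' });
const { locale, granularity } = localeSegmenter.resolvedOptions();

console.log(supported);           // depends on the runtime's locale data
console.log(locale, granularity); // resolved locale, then 'word'
```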

Accessibility

  • Better segmentation can improve counters, truncation, and editing behavior for multilingual users, reducing broken text experiences.

Powered by web-features