13

A slug, in this context, is a string that is safe to use as an identifier in URLs or CSS. For example, if you have this string:

I'd like to eat at McRunchies!

Its slug would be:

i-d-like-to-eat-at-mcrunchies

I want to know whether there's a standard way of building such strings in Drupal (or with PHP functions available from Drupal). More precisely, inside a Drupal theme.

Context: I'm modifying a Drupal theme so that the HTML of the nodes it generates includes their taxonomy terms as CSS classes on the containing div. The trouble is that some of those terms' names aren't valid CSS class names, so I need to "slugify" them.

I've read that some people simply do this:

str_replace(" ", "-", $term->name)

This isn't really enough for me. It doesn't convert uppercase letters to lowercase and, more importantly, it doesn't replace non-ASCII characters (like à or é) with their ASCII equivalents. It also doesn't remove separator characters from the beginning and end of the string.
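
To illustrate the shortcomings, here is a minimal sketch of what the plain str_replace approach produces for the example string above (the variable names are only for illustration):

```php
<?php
// Replacing spaces alone keeps uppercase letters and punctuation.
$name = "I'd like to eat at McRunchies!";
$naive = str_replace(' ', '-', $name);
// $naive is now "I'd-like-to-eat-at-McRunchies!": the apostrophe and the
// exclamation mark would still need escaping in a CSS selector.
```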

Is there a function in Drupal 6 (or the PHP libraries) that slugifies a string and can be used in a theme's template.php file?

kikito

9 Answers

16

You can use built-in Drupal functions to do this.

$string = drupal_clean_css_identifier($string);
$slug = drupal_html_class($string);

These functions will do the trick for you.

heshanlk
11

I am a happy Zen theme user, so I came across this handy function that ships with it: zen_id_safe (http://api.lullabot.com/zen_id_safe).

It does not depend on any other theme function, so you can just copy it to your module or theme and use it. It is a pretty small and simple function, so I will just paste it here for convenience.

function zen_id_safe($string) {
  // Replace with dashes anything that isn't a letter, a number, or a dash.
  return strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
}

Capi Etheriel
  • This is nearly what I needed. However, it doesn't transliterate and it doesn't remove separators from the beginning. Anyway, thanks for taking the time to answer. – kikito May 20 '10 at 07:31
  • You can add logic for removing separators (note that this is only a requirement for IDs, since classes can use anything; see http://barney.w3.org/TR/REC-html40/struct/global.html#adef-class and click on cdata-list). As for proper transliteration, see my comment on googletorp's answer. – Capi Etheriel May 20 '10 at 15:36
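
The separator-trimming logic mentioned in the comments could be added roughly like this. This is only a sketch: my_id_safe() is a hypothetical name, not part of Zen or Drupal, and it still does not transliterate accented characters.

```php
<?php
/**
 * Hypothetical variant of zen_id_safe() that also trims separators.
 * The name my_id_safe() is an assumption made for this example.
 */
function my_id_safe($string) {
  // Replace every run of characters other than letters, digits and dashes
  // with a single dash, then lowercase the result.
  $string = strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
  // Strip dashes left over at the start and end.
  return trim($string, '-');
}

$class = my_id_safe("I'd like to eat at McRunchies!");
// $class is "i-d-like-to-eat-at-mcrunchies"
```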
10

I ended up using the slug function explained here (at the end of the article, you have to click in order to see the source code).

This does what I need and a couple things more, without needing to include external modules and the like.

Pasting the code below for easy future reference:

/**
 * Calculate a slug with a maximum length for a string.
 *
 * @param $string
 *   The string you want to calculate a slug for.
 * @param $length
 *   The maximum length the slug can have (-1 means no limit).
 * @param $separator
 *   The string used to replace non-alphanumeric characters.
 * @return
 *   A string representing the slug
 */
function slug($string, $length = -1, $separator = '-') {
  // transliterate
  $string = transliterate($string);
 
  // lowercase
  $string = strtolower($string);
 
  // replace every character that is not a letter or digit with the separator
  $string = preg_replace('/[^a-z0-9]/i', $separator, $string);
 
  // replace multiple occurrences of the separator with one instance
  $string = preg_replace('/(?:'. preg_quote($separator, '/') .')+/', $separator, $string);
 
  // cut off to maximum length
  if ($length > -1 && strlen($string) > $length) {
    $string = substr($string, 0, $length);
  }
 
  // remove separator from start and end of string
  $string = preg_replace('/'. preg_quote($separator, '/') .'$/', '', $string);
  $string = preg_replace('/^'. preg_quote($separator, '/') .'/', '', $string);
 
  return $string;
}
 
/**
 * Transliterate a given string.
 *
 * @param $string
 *   The string you want to transliterate.
 * @return
 *   A string representing the transliterated version of the input string.
 */
function transliterate($string) {
  static $charmap;
  if (!$charmap) {
    $charmap = array(
      // Decompositions for Latin-1 Supplement
      chr(195) . chr(128) => 'A', chr(195) . chr(129) => 'A',
      chr(195) . chr(130) => 'A', chr(195) . chr(131) => 'A',
      chr(195) . chr(132) => 'A', chr(195) . chr(133) => 'A',
      chr(195) . chr(135) => 'C', chr(195) . chr(136) => 'E',
      chr(195) . chr(137) => 'E', chr(195) . chr(138) => 'E',
      chr(195) . chr(139) => 'E', chr(195) . chr(140) => 'I',
      chr(195) . chr(141) => 'I', chr(195) . chr(142) => 'I',
      chr(195) . chr(143) => 'I', chr(195) . chr(145) => 'N',
      chr(195) . chr(146) => 'O', chr(195) . chr(147) => 'O',
      chr(195) . chr(148) => 'O', chr(195) . chr(149) => 'O',
      chr(195) . chr(150) => 'O', chr(195) . chr(153) => 'U',
      chr(195) . chr(154) => 'U', chr(195) . chr(155) => 'U',
      chr(195) . chr(156) => 'U', chr(195) . chr(157) => 'Y',
      chr(195) . chr(159) => 's', chr(195) . chr(160) => 'a',
      chr(195) . chr(161) => 'a', chr(195) . chr(162) => 'a',
      chr(195) . chr(163) => 'a', chr(195) . chr(164) => 'a',
      chr(195) . chr(165) => 'a', chr(195) . chr(167) => 'c',
      chr(195) . chr(168) => 'e', chr(195) . chr(169) => 'e',
      chr(195) . chr(170) => 'e', chr(195) . chr(171) => 'e',
      chr(195) . chr(172) => 'i', chr(195) . chr(173) => 'i',
      chr(195) . chr(174) => 'i', chr(195) . chr(175) => 'i',
      chr(195) . chr(177) => 'n', chr(195) . chr(178) => 'o',
      chr(195) . chr(179) => 'o', chr(195) . chr(180) => 'o',
      chr(195) . chr(181) => 'o', chr(195) . chr(182) => 'o',
      chr(195) . chr(184) => 'o', chr(195) . chr(185) => 'u',
      chr(195) . chr(186) => 'u', chr(195) . chr(187) => 'u',
      chr(195) . chr(188) => 'u', chr(195) . chr(189) => 'y',
      chr(195) . chr(191) => 'y',
      // Decompositions for Latin Extended-A
      chr(196) . chr(128) => 'A', chr(196) . chr(129) => 'a',
      chr(196) . chr(130) => 'A', chr(196) . chr(131) => 'a',
      chr(196) . chr(132) => 'A', chr(196) . chr(133) => 'a',
      chr(196) . chr(134) => 'C', chr(196) . chr(135) => 'c',
      chr(196) . chr(136) => 'C', chr(196) . chr(137) => 'c',
      chr(196) . chr(138) => 'C', chr(196) . chr(139) => 'c',
      chr(196) . chr(140) => 'C', chr(196) . chr(141) => 'c',
      chr(196) . chr(142) => 'D', chr(196) . chr(143) => 'd',
      chr(196) . chr(144) => 'D', chr(196) . chr(145) => 'd',
      chr(196) . chr(146) => 'E', chr(196) . chr(147) => 'e',
      chr(196) . chr(148) => 'E', chr(196) . chr(149) => 'e',
      chr(196) . chr(150) => 'E', chr(196) . chr(151) => 'e',
      chr(196) . chr(152) => 'E', chr(196) . chr(153) => 'e',
      chr(196) . chr(154) => 'E', chr(196) . chr(155) => 'e',
      chr(196) . chr(156) => 'G', chr(196) . chr(157) => 'g',
      chr(196) . chr(158) => 'G', chr(196) . chr(159) => 'g',
      chr(196) . chr(160) => 'G', chr(196) . chr(161) => 'g',
      chr(196) . chr(162) => 'G', chr(196) . chr(163) => 'g',
      chr(196) . chr(164) => 'H', chr(196) . chr(165) => 'h',
      chr(196) . chr(166) => 'H', chr(196) . chr(167) => 'h',
      chr(196) . chr(168) => 'I', chr(196) . chr(169) => 'i',
      chr(196) . chr(170) => 'I', chr(196) . chr(171) => 'i',
      chr(196) . chr(172) => 'I', chr(196) . chr(173) => 'i',
      chr(196) . chr(174) => 'I', chr(196) . chr(175) => 'i',
      chr(196) . chr(176) => 'I', chr(196) . chr(177) => 'i',
      chr(196) . chr(178) => 'IJ', chr(196) . chr(179) => 'ij',
      chr(196) . chr(180) => 'J', chr(196) . chr(181) => 'j',
      chr(196) . chr(182) => 'K', chr(196) . chr(183) => 'k',
      chr(196) . chr(184) => 'k', chr(196) . chr(185) => 'L',
      chr(196) . chr(186) => 'l', chr(196) . chr(187) => 'L',
      chr(196) . chr(188) => 'l', chr(196) . chr(189) => 'L',
      chr(196) . chr(190) => 'l', chr(196) . chr(191) => 'L',
      chr(197) . chr(128) => 'l', chr(197) . chr(129) => 'L',
      chr(197) . chr(130) => 'l', chr(197) . chr(131) => 'N',
      chr(197) . chr(132) => 'n', chr(197) . chr(133) => 'N',
      chr(197) . chr(134) => 'n', chr(197) . chr(135) => 'N',
      chr(197) . chr(136) => 'n', chr(197) . chr(137) => 'N',
      chr(197) . chr(138) => 'n', chr(197) . chr(139) => 'N',
      chr(197) . chr(140) => 'O', chr(197) . chr(141) => 'o',
      chr(197) . chr(142) => 'O', chr(197) . chr(143) => 'o',
      chr(197) . chr(144) => 'O', chr(197) . chr(145) => 'o',
      chr(197) . chr(146) => 'OE', chr(197) . chr(147) => 'oe',
      chr(197) . chr(148) => 'R', chr(197) . chr(149) => 'r',
      chr(197) . chr(150) => 'R', chr(197) . chr(151) => 'r',
      chr(197) . chr(152) => 'R', chr(197) . chr(153) => 'r',
      chr(197) . chr(154) => 'S', chr(197) . chr(155) => 's',
      chr(197) . chr(156) => 'S', chr(197) . chr(157) => 's',
      chr(197) . chr(158) => 'S', chr(197) . chr(159) => 's',
      chr(197) . chr(160) => 'S', chr(197) . chr(161) => 's',
      chr(197) . chr(162) => 'T', chr(197) . chr(163) => 't',
      chr(197) . chr(164) => 'T', chr(197) . chr(165) => 't',
      chr(197) . chr(166) => 'T', chr(197) . chr(167) => 't',
      chr(197) . chr(168) => 'U', chr(197) . chr(169) => 'u',
      chr(197) . chr(170) => 'U', chr(197) . chr(171) => 'u',
      chr(197) . chr(172) => 'U', chr(197) . chr(173) => 'u',
      chr(197) . chr(174) => 'U', chr(197) . chr(175) => 'u',
      chr(197) . chr(176) => 'U', chr(197) . chr(177) => 'u',
      chr(197) . chr(178) => 'U', chr(197) . chr(179) => 'u',
      chr(197) . chr(180) => 'W', chr(197) . chr(181) => 'w',
      chr(197) . chr(182) => 'Y', chr(197) . chr(183) => 'y',
      chr(197) . chr(184) => 'Y', chr(197) . chr(185) => 'Z',
      chr(197) . chr(186) => 'z', chr(197) . chr(187) => 'Z',
      chr(197) . chr(188) => 'z', chr(197) . chr(189) => 'Z',
      chr(197) . chr(190) => 'z', chr(197) . chr(191) => 's',
      // Euro Sign
      chr(226) . chr(130) . chr(172) => 'E'
    );
  }
 
  // transliterate
  return strtr($string, $charmap);
}
 
function is_slug($str) {
  return $str === slug($str);
}
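
As a quick check of the approach, here is a condensed, self-contained sketch of the same pipeline (transliterate, lowercase, replace, trim). The name slug_demo() and its deliberately tiny character map are assumptions made for this illustration; the full table above covers far more characters.

```php
<?php
// Condensed sketch of the pipeline above; slug_demo() is a hypothetical name.
function slug_demo($string) {
  // Tiny illustrative map of multi-byte UTF-8 characters to ASCII.
  $map = array('à' => 'a', 'é' => 'e', 'ñ' => 'n');
  // Transliterate first, then lowercase the now-ASCII letters.
  $string = strtolower(strtr($string, $map));
  // Collapse every run of non-alphanumeric characters into one dash.
  $string = preg_replace('/[^a-z0-9]+/', '-', $string);
  // Remove dashes left at either end.
  return trim($string, '-');
}

$slug = slug_demo("I'd like to eat at McRunchies!"); // "i-d-like-to-eat-at-mcrunchies"
$cafe = slug_demo('Café à la carte');                // "cafe-a-la-carte"
```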
Hugo
kikito
6

There's also this function from Drupal 7, which you can copy into your project:

http://api.drupal.org/api/function/drupal_clean_css_identifier/7
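
For reference, a simplified stand-in modeled on that function might look like the sketch below. It is not the verbatim Drupal source, and my_clean_css_identifier() is a hypothetical name; check the API page above before relying on it. Note that it neither lowercases nor transliterates, so non-ASCII letters are kept as they are.

```php
<?php
/**
 * Simplified stand-in modeled on Drupal 7's drupal_clean_css_identifier().
 * my_clean_css_identifier() is a hypothetical name used for illustration.
 */
function my_clean_css_identifier($identifier) {
  // Turn common separators into dashes and drop closing brackets.
  $filter = array(' ' => '-', '_' => '-', '/' => '-', '[' => '-', ']' => '');
  $identifier = strtr($identifier, $filter);
  // Keep dashes, digits, ASCII letters, underscores and characters from
  // U+00A1 upward; strip everything else.
  return preg_replace('/[^\x{002D}\x{0030}-\x{0039}\x{0041}-\x{005A}\x{005F}\x{0061}-\x{007A}\x{00A1}-\x{FFFF}]/u', '', $identifier);
}

// Lowercasing is done separately here, since the function keeps case.
$class = strtolower(my_clean_css_identifier("I'd like to eat at McRunchies!"));
// $class is "id-like-to-eat-at-mcrunchies"
```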

sprugman
  • It is nice to know that drupal has a function like this. However it doesn't do everything I needed (see my other answers). But +1 for the research effort. – kikito May 20 '10 at 07:33
2

This might help. I find I am doing this slugging all the time now rather than using ID numbers as unique keys in my tables.

/**
 * class SlugMaker
 *
 * methods to create text slugs for urls
 */
class SlugMaker {

    /**
     * method slugifyAlnum
     *
     * cleans up a string such as a page title
     * so it becomes a readable valid url
     *
     * @param STR a string
     * @return STR a url friendly slug
     */
    function slugifyAlnum( $str ){
        $str = preg_replace('#[^0-9a-z ]#i', '', $str );    // allow letters, numbers + spaces only
        $str = preg_replace('#( ){2,}#', ' ', $str );       // rm adjacent spaces
        $str = trim( $str );

        return strtolower( str_replace( ' ', '-', $str ) ); // slugify
    }

    /**
     * method slugifyAlnumAppendMonth
     *
     * appends the current month and year to the slug, e.g. "-may-2010"
     */
    function slugifyAlnumAppendMonth( $str ){
        $val = $this->slugifyAlnum( $str );

        return $val . '-' . strtolower( date( "M" ) ) . '-' . date( "Y" );
    }

}

Using this and .htaccess rules means you go directly from a url like:

/articles/my-pops-nuts-may-2010

Straight through to the table look up without having to unmap IDs (applying a suitable filter naturally).

Append or prepend some kind of date optionally in order to enforce a degree of uniqueness as you wish.

HTH

Cups
  • Thanks for posting this. The only thing I don't like about this function is that it leaves the separators at the beginning and end of identifiers; if you have something like `#1 - Option 1` it will get transformed into `-1-option-1`, which is not safe for use in CSS. A minor thing is that it doesn't transliterate. – kikito May 20 '10 at 07:29
  • Upvote for example URL `/articles/my-pops-nuts-may-2010` – JamesWilson Sep 03 '16 at 00:22
1

I would recommend the transliteration module, which pathauto uses. With it you can use the transliteration_get() function. It also does Unicode transformation.

googletorp
  • pathauto does not use the transliteration module. It uses its own function pathauto_cleanstring(), which depends on loads of settings from pathauto. http://drupalcontrib.org/api/function/pathauto_cleanstring/6 – Capi Etheriel May 19 '10 at 12:47
  • @barraponto You can make pathauto use it to handle Unicode in URLs, which it doesn't handle very well otherwise. – googletorp May 19 '10 at 13:02
  • how do i get pathauto to use transliteration module? i've been looking for that... http://stackoverflow.com/questions/2865742/how-to-use-pathauto-and-transliteration-modules-together – Capi Etheriel May 19 '10 at 13:11
  • Thanks for posting this. However, I'll avoid adding additional modules if possible. I was looking for something provided directly by Drupal or PHP. – kikito May 20 '10 at 07:30
  • i bet you are already using pathauto. it has a built-in transliteration file (i18n-ascii.txt) which will provide transliteration in pathauto_cleanstring(). you do NOT need the transliteration module. – Capi Etheriel May 20 '10 at 15:28
  • @barraponto, Transliteration is my recommendation, I haven't said that it's required or the only way. – googletorp May 20 '10 at 15:35
  • @googletorp, I noticed, I just wanted to make it clear that transliteration is not needed, since egarcia discarded your answer because of extra modules. pathauto_cleanstring() is enough; the transliteration module is great for file paths (which pathauto does not handle). I highly recommend it. – Capi Etheriel May 20 '10 at 16:12
1

For Drupal 8/9 you can use Html::getClass

$slugify = Html::getClass('A @ Stríng-that n+eeds cónvert');

Don't forget to import the namespace when using it inside a module:

use Drupal\Component\Utility\Html;
Matthias O.
0

You can use preg_replace and strtolower:

preg_replace('/[^a-z]/','-', strtolower($term->name)); 
Brice Favre
  • This is clean and simple. Unfortunately it doesn't do everything I need. But thanks for answering. – kikito May 20 '10 at 07:33
  • I just found that basic theme implements what you're looking for this way : $string = strtolower(preg_replace('/[^a-zA-Z0-9_-]+/', '-', $string)); – Brice Favre May 27 '10 at 10:45
0

This is what worked for me after a lot of trial and error, including for converting both French and German titles with special characters to a slug.

I created a custom Twig filter so I can use it like this:

{{ node.field_title.value|slug }}

It will convert:

Wärmeabgabe & Abmessungen
Typenübersicht
Montage- und Anschlussmaße

Into:

warmeabgabe--abmessungen
typenubersicht
montage--und-anschlussmasse

for example.

HOWTO: In a custom module, create a services.yml file: modules/custom/mymodule/mymodule.services.yml

services:
  mymodule.twig_extensions:
    class: Drupal\mymodule\HelperTwigExtensions
    tags:
      - { name: twig.extension }

Create the modules/custom/mymodule/src/HelperTwigExtensions.php file:

<?php

namespace Drupal\mymodule;

use Drupal\Component\Utility\Html;

/**
 * Extend Twig's Twig_Extension class to provide a "slug" filter.
 */
class HelperTwigExtensions extends \Twig_Extension {

  /**
   * {@inheritdoc}
   */
  public function getName() {
    return 'mymodule.twig_extensions';
  }

  /**
   * {@inheritdoc}
   */
  public function getFilters() {
    return [
      new \Twig_SimpleFilter('slug', [$this, 'createSlug']),
    ];
  }

  /**
   * Create a slug from a string input.
   */
  public function createSlug($input) {
    // Convert most of the special characters.
    $slug = Html::getClass($input);
    $slug = strtolower($slug);
    // Convert accented text characters.
    $unwanted_array = [
      'Þ' => 'b',
      'ß' => 'ss',
      'à' => 'a',
      'á' => 'a',
      'â' => 'a',
      'ã' => 'a',
      'ä' => 'a',
      'å' => 'a',
      'æ' => 'a',
      'ç' => 'c',
      'è' => 'e',
      'é' => 'e',
      'ê' => 'e',
      'ë' => 'e',
      'ì' => 'i',
      'í' => 'i',
      'î' => 'i',
      'ï' => 'i',
      'ð' => 'o',
      'ñ' => 'n',
      'ò' => 'o',
      'ó' => 'o',
      'ô' => 'o',
      'õ' => 'o',
      'ö' => 'o',
      'ø' => 'o',
      'ù' => 'u',
      'ú' => 'u',
      'û' => 'u',
      'ü' => 'u',
      'ý' => 'y',
      'þ' => 'b',
      'ÿ' => 'y',
    ];
    $slug = strtr($slug, $unwanted_array);
    return $slug;
  }

}
Flyke